Bayesian linear regression

From Wikipedia, the free encyclopedia
Method of statistical analysis
Not to be confused with Bayes linear statistics.

Bayesian linear regression is a type of conditional modeling in which the mean of one variable is described by a linear combination of other variables, with the goal of obtaining the posterior probability of the regression coefficients (as well as other parameters describing the distribution of the regressand) and ultimately allowing the out-of-sample prediction of the regressand (often labelled $y$) conditional on observed values of the regressors (usually $X$). The simplest and most widely used version of this model is the normal linear model, in which $y$ given $X$ is distributed Gaussian. In this model, and under a particular choice of prior probabilities for the parameters (so-called conjugate priors), the posterior can be found analytically. With more arbitrarily chosen priors, the posteriors generally have to be approximated.

Model setup

Consider a standard linear regression problem, in which for $i = 1, \ldots, n$ we specify the mean of the conditional distribution of $y_i$ given a $k \times 1$ predictor vector $\mathbf{x}_i$:

$$y_i = \mathbf{x}_i^{\mathsf{T}} \boldsymbol{\beta} + \varepsilon_i,$$

where $\boldsymbol{\beta}$ is a $k \times 1$ vector, and the $\varepsilon_i$ are independent and identically normally distributed random variables:

$$\varepsilon_i \sim N(0, \sigma^2).$$

This corresponds to the following likelihood function:

$$\rho(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2) \propto (\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right).$$

The ordinary least squares solution is used to estimate the coefficient vector using the Moore–Penrose pseudoinverse:

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}\mathbf{y},$$

where $\mathbf{X}$ is the $n \times k$ design matrix, each row of which is a predictor vector $\mathbf{x}_i^{\mathsf{T}}$, and $\mathbf{y}$ is the column $n$-vector $[y_1 \; \cdots \; y_n]^{\mathsf{T}}$.
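
As a concrete illustration (not part of the original exposition), the following is a minimal NumPy sketch that simulates data from this model and computes the least squares estimate; the sample size, the number of coefficients, and the "true" parameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3                                   # sample size and number of coefficients (arbitrary)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # design matrix with intercept
beta_true = np.array([1.0, 2.0, -0.5])         # assumed "true" coefficients for the simulation
sigma_true = 0.8
y = X @ beta_true + rng.normal(scale=sigma_true, size=n)          # y_i = x_i^T beta + eps_i

# Ordinary least squares: beta_hat = (X^T X)^{-1} X^T y, solved without forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```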

This is a frequentist approach, and it assumes that there are enough measurements to say something meaningful about $\boldsymbol{\beta}$. In the Bayesian approach,[1] the data is supplemented with additional information in the form of a prior probability distribution. The prior belief about the parameters is combined with the data's likelihood function according to Bayes' theorem to yield the posterior belief about the parameters $\boldsymbol{\beta}$ and $\sigma$. The prior can take different functional forms depending on the domain and the information that is available a priori.

Since the data comprise both $\mathbf{y}$ and $\mathbf{X}$, the focus only on the distribution of $\mathbf{y}$ conditional on $\mathbf{X}$ needs justification. In fact, a "full" Bayesian analysis would require a joint likelihood $\rho(\mathbf{y}, \mathbf{X} \mid \boldsymbol{\beta}, \sigma^2, \gamma)$ along with a prior $\rho(\boldsymbol{\beta}, \sigma^2, \gamma)$, where $\gamma$ symbolizes the parameters of the distribution for $\mathbf{X}$. Only under the assumption of (weak) exogeneity can the joint likelihood be factored into $\rho(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2)\,\rho(\mathbf{X} \mid \gamma)$.[2] The latter part is usually ignored under the assumption of disjoint parameter sets. Moreover, under classical assumptions $\mathbf{X}$ is considered chosen (for example, in a designed experiment) and therefore has a known probability without parameters.[3]

With conjugate priors

Conjugate prior distribution

For an arbitrary prior distribution, there may be no analytical solution for the posterior distribution. In this section, we will consider a so-called conjugate prior for which the posterior distribution can be derived analytically.

A prior $\rho(\boldsymbol{\beta}, \sigma^2)$ is conjugate to this likelihood function if it has the same functional form with respect to $\boldsymbol{\beta}$ and $\sigma$. Since the log-likelihood is quadratic in $\boldsymbol{\beta}$, the log-likelihood is re-written such that the likelihood becomes normal in $(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})$. Write

$$
\begin{aligned}
(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})
&= [(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) + (\mathbf{X}\hat{\boldsymbol{\beta}} - \mathbf{X}\boldsymbol{\beta})]^{\mathsf{T}}[(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) + (\mathbf{X}\hat{\boldsymbol{\beta}} - \mathbf{X}\boldsymbol{\beta})] \\
&= (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) + (\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X})(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}}) + \underbrace{2(\mathbf{X}\hat{\boldsymbol{\beta}} - \mathbf{X}\boldsymbol{\beta})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})}_{=\,0} \\
&= (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) + (\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X})(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})\,.
\end{aligned}
$$

The likelihood is now re-written as

$$\rho(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2) \propto (\sigma^2)^{-\frac{v}{2}} \exp\left(-\frac{v s^2}{2\sigma^2}\right) (\sigma^2)^{-\frac{n - v}{2}} \exp\left(-\frac{1}{2\sigma^2}(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X})(\boldsymbol{\beta} - \hat{\boldsymbol{\beta}})\right),$$

where

$$v s^2 = (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) \quad\text{ and }\quad v = n - k,$$

where $k$ is the number of regression coefficients.
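
The sum-of-squares decomposition above can be checked numerically; the following is a small sanity-check sketch (the simulated data and the test point are arbitrary assumptions), verifying that the cross term vanishes at the least squares solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

beta = rng.normal(size=k)                          # an arbitrary coefficient vector
lhs = (y - X @ beta) @ (y - X @ beta)              # (y - X beta)^T (y - X beta)
vs2 = (y - X @ beta_hat) @ (y - X @ beta_hat)      # v s^2 with v = n - k
quad = (beta - beta_hat) @ (X.T @ X) @ (beta - beta_hat)
print(np.isclose(lhs, vs2 + quad))                 # True for any beta
```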

This suggests a form for the prior:

$$\rho(\boldsymbol{\beta}, \sigma^2) = \rho(\sigma^2)\,\rho(\boldsymbol{\beta} \mid \sigma^2),$$

where $\rho(\sigma^2)$ is an inverse-gamma distribution

$$\rho(\sigma^2) \propto (\sigma^2)^{-\frac{v_0}{2} - 1} \exp\left(-\frac{v_0 s_0^2}{2\sigma^2}\right).$$

In the notation introduced in the inverse-gamma distribution article, this is the density of an $\text{Inv-Gamma}(a_0, b_0)$ distribution with $a_0 = \tfrac{v_0}{2}$ and $b_0 = \tfrac{1}{2} v_0 s_0^2$, with $v_0$ and $s_0^2$ as the prior values of $v$ and $s^2$, respectively. Equivalently, it can also be described as a scaled inverse chi-squared distribution, $\text{Scale-inv-}\chi^2(v_0, s_0^2)$.

Further, the conditional prior density $\rho(\boldsymbol{\beta} \mid \sigma^2)$ is a normal distribution,

$$\rho(\boldsymbol{\beta} \mid \sigma^2) \propto (\sigma^2)^{-k/2} \exp\left(-\frac{1}{2\sigma^2}(\boldsymbol{\beta} - \boldsymbol{\mu}_0)^{\mathsf{T}} \boldsymbol{\Lambda}_0 (\boldsymbol{\beta} - \boldsymbol{\mu}_0)\right).$$

In the notation of the normal distribution, the conditional prior distribution is $\mathcal{N}\left(\boldsymbol{\mu}_0, \sigma^2 \boldsymbol{\Lambda}_0^{-1}\right)$.
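
As a brief illustration (a sketch, not part of the original article), the combined normal-inverse-gamma prior can be sampled as follows; all hyperparameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3
v0, s0_sq = 4.0, 1.0                       # prior degrees of freedom and scale (arbitrary)
a0, b0 = v0 / 2, v0 * s0_sq / 2            # equivalent inverse-gamma parameters
mu0 = np.zeros(k)                          # prior mean of beta
Lambda0 = np.eye(k)                        # prior precision matrix (scaled by 1/sigma^2)

# sigma^2 ~ Inv-Gamma(a0, b0): draw as the reciprocal of a Gamma(a0, rate=b0) variate
sigma2 = 1.0 / rng.gamma(shape=a0, scale=1.0 / b0)
# beta | sigma^2 ~ N(mu0, sigma^2 * Lambda0^{-1})
beta = rng.multivariate_normal(mu0, sigma2 * np.linalg.inv(Lambda0))
print(sigma2, beta)
```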

Posterior distribution

With the prior now specified, the posterior distribution can be expressed as

$$
\begin{aligned}
\rho(\boldsymbol{\beta}, \sigma^2 \mid \mathbf{y}, \mathbf{X})
&\propto \rho(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2)\,\rho(\boldsymbol{\beta} \mid \sigma^2)\,\rho(\sigma^2) \\
&\propto (\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right) (\sigma^2)^{-k/2} \exp\left(-\frac{1}{2\sigma^2}(\boldsymbol{\beta} - \boldsymbol{\mu}_0)^{\mathsf{T}} \boldsymbol{\Lambda}_0 (\boldsymbol{\beta} - \boldsymbol{\mu}_0)\right) (\sigma^2)^{-(a_0 + 1)} \exp\left(-\frac{b_0}{\sigma^2}\right).
\end{aligned}
$$

With some re-arrangement,[4] the posterior can be re-written so that the posterior mean $\boldsymbol{\mu}_n$ of the parameter vector $\boldsymbol{\beta}$ can be expressed in terms of the least squares estimator $\hat{\boldsymbol{\beta}}$ and the prior mean $\boldsymbol{\mu}_0$, with the strength of the prior indicated by the prior precision matrix $\boldsymbol{\Lambda}_0$:

$$\boldsymbol{\mu}_n = (\mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0)^{-1}(\mathbf{X}^{\mathsf{T}}\mathbf{X}\hat{\boldsymbol{\beta}} + \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0).$$

To justify that $\boldsymbol{\mu}_n$ is indeed the posterior mean, the quadratic terms in the exponential can be re-arranged as a quadratic form in $\boldsymbol{\beta} - \boldsymbol{\mu}_n$:[5]

$$(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\mathsf{T}}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) + (\boldsymbol{\beta} - \boldsymbol{\mu}_0)^{\mathsf{T}} \boldsymbol{\Lambda}_0 (\boldsymbol{\beta} - \boldsymbol{\mu}_0) = (\boldsymbol{\beta} - \boldsymbol{\mu}_n)^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0)(\boldsymbol{\beta} - \boldsymbol{\mu}_n) + \mathbf{y}^{\mathsf{T}}\mathbf{y} - \boldsymbol{\mu}_n^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0)\boldsymbol{\mu}_n + \boldsymbol{\mu}_0^{\mathsf{T}} \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0.$$
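
This completion-of-the-square identity can also be verified numerically; the sketch below uses arbitrary simulated data, an arbitrary prior, and an arbitrary test point.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)

mu0, Lambda0 = np.zeros(k), np.eye(k)                  # arbitrary prior hyperparameters
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
Lambda_n = X.T @ X + Lambda0
mu_n = np.linalg.solve(Lambda_n, X.T @ X @ beta_hat + Lambda0 @ mu0)

beta = rng.normal(size=k)                              # arbitrary test point
lhs = (y - X @ beta) @ (y - X @ beta) + (beta - mu0) @ Lambda0 @ (beta - mu0)
rhs = ((beta - mu_n) @ Lambda_n @ (beta - mu_n)
       + y @ y - mu_n @ Lambda_n @ mu_n + mu0 @ Lambda0 @ mu0)
print(np.isclose(lhs, rhs))                            # True for any beta
```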

Now the posterior can be expressed as a normal distribution times an inverse-gamma distribution:

$$\rho(\boldsymbol{\beta}, \sigma^2 \mid \mathbf{y}, \mathbf{X}) \propto (\sigma^2)^{-k/2} \exp\left(-\frac{1}{2\sigma^2}(\boldsymbol{\beta} - \boldsymbol{\mu}_n)^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0)(\boldsymbol{\beta} - \boldsymbol{\mu}_n)\right) (\sigma^2)^{-\frac{n + 2a_0}{2} - 1} \exp\left(-\frac{2b_0 + \mathbf{y}^{\mathsf{T}}\mathbf{y} - \boldsymbol{\mu}_n^{\mathsf{T}}(\mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0)\boldsymbol{\mu}_n + \boldsymbol{\mu}_0^{\mathsf{T}}\boldsymbol{\Lambda}_0\boldsymbol{\mu}_0}{2\sigma^2}\right).$$

Therefore, the posterior distribution can be parametrized as follows:

$$\rho(\boldsymbol{\beta}, \sigma^2 \mid \mathbf{y}, \mathbf{X}) \propto \rho(\boldsymbol{\beta} \mid \sigma^2, \mathbf{y}, \mathbf{X})\,\rho(\sigma^2 \mid \mathbf{y}, \mathbf{X}),$$

where the two factors correspond to the densities of $\mathcal{N}\left(\boldsymbol{\mu}_n, \sigma^2 \boldsymbol{\Lambda}_n^{-1}\right)$ and $\text{Inv-Gamma}(a_n, b_n)$ distributions, with their parameters given by

$$\boldsymbol{\Lambda}_n = \mathbf{X}^{\mathsf{T}}\mathbf{X} + \boldsymbol{\Lambda}_0, \qquad \boldsymbol{\mu}_n = \boldsymbol{\Lambda}_n^{-1}(\mathbf{X}^{\mathsf{T}}\mathbf{X}\hat{\boldsymbol{\beta}} + \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0),$$

$$a_n = a_0 + \frac{n}{2}, \qquad b_n = b_0 + \frac{1}{2}\left(\mathbf{y}^{\mathsf{T}}\mathbf{y} + \boldsymbol{\mu}_0^{\mathsf{T}}\boldsymbol{\Lambda}_0\boldsymbol{\mu}_0 - \boldsymbol{\mu}_n^{\mathsf{T}}\boldsymbol{\Lambda}_n\boldsymbol{\mu}_n\right).$$

Because the posterior mean $\boldsymbol{\mu}_n$ is a precision-weighted combination of the prior mean and the least squares estimate, this illustrates that Bayesian inference is a compromise between the information contained in the prior and the information contained in the sample.
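
A minimal sketch of these update formulas in NumPy follows (illustrative only; the simulated data and the prior hyperparameters are arbitrary assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)

# Prior hyperparameters (arbitrary choices for illustration)
mu0 = np.zeros(k)
Lambda0 = np.eye(k)
a0, b0 = 2.0, 2.0

# Conjugate posterior updates
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
Lambda_n = X.T @ X + Lambda0
mu_n = np.linalg.solve(Lambda_n, X.T @ X @ beta_hat + Lambda0 @ mu0)
a_n = a0 + n / 2
b_n = b0 + 0.5 * (y @ y + mu0 @ Lambda0 @ mu0 - mu_n @ Lambda_n @ mu_n)
print(mu_n, a_n, b_n)
```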

Model evidence

The model evidence $p(\mathbf{y} \mid m)$ is the probability of the data given the model $m$. It is also known as the marginal likelihood, and as the prior predictive density. Here, the model is defined by the likelihood function $p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma)$ and the prior distribution on the parameters, i.e. $p(\boldsymbol{\beta}, \sigma)$. The model evidence captures in a single number how well such a model explains the observations. The model evidence of the Bayesian linear regression model presented in this section can be used to compare competing linear models by Bayes factors. These models may differ in the number and values of the predictor variables as well as in their priors on the model parameters. Model complexity is already taken into account by the model evidence, because it marginalizes out the parameters by integrating $p(\mathbf{y}, \boldsymbol{\beta}, \sigma \mid \mathbf{X})$ over all possible values of $\boldsymbol{\beta}$ and $\sigma$:

$$p(\mathbf{y} \mid m) = \int p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma)\,p(\boldsymbol{\beta}, \sigma)\,d\boldsymbol{\beta}\,d\sigma.$$

This integral can be computed analytically and the solution is given in the following equation:[6]

$$p(\mathbf{y} \mid m) = \frac{1}{(2\pi)^{n/2}} \sqrt{\frac{\det(\boldsymbol{\Lambda}_0)}{\det(\boldsymbol{\Lambda}_n)}} \cdot \frac{b_0^{a_0}}{b_n^{a_n}} \cdot \frac{\Gamma(a_n)}{\Gamma(a_0)}.$$

Here $\Gamma$ denotes the gamma function. Because we have chosen a conjugate prior, the marginal likelihood can also be easily computed by evaluating the following equality for arbitrary values of $\boldsymbol{\beta}$ and $\sigma$:[7]

$$p(\mathbf{y} \mid m) = \frac{p(\boldsymbol{\beta}, \sigma \mid m)\,p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma, m)}{p(\boldsymbol{\beta}, \sigma \mid \mathbf{y}, \mathbf{X}, m)}.$$

Note that this equation follows from a re-arrangement of Bayes' theorem. Inserting the formulas for the prior, the likelihood, and the posterior and simplifying the resulting expression leads to the analytic expression given above.
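
The closed-form expression above can be evaluated directly; the sketch below computes the log evidence for the same assumed data and prior as before (SciPy's gammaln is used to avoid overflow in the gamma function).

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)

mu0, Lambda0, a0, b0 = np.zeros(k), np.eye(k), 2.0, 2.0
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
Lambda_n = X.T @ X + Lambda0
mu_n = np.linalg.solve(Lambda_n, X.T @ X @ beta_hat + Lambda0 @ mu0)
a_n = a0 + n / 2
b_n = b0 + 0.5 * (y @ y + mu0 @ Lambda0 @ mu0 - mu_n @ Lambda_n @ mu_n)

# log p(y | m), taking logarithms of the closed-form expression term by term
log_evidence = (-0.5 * n * np.log(2 * np.pi)
                + 0.5 * (np.linalg.slogdet(Lambda0)[1] - np.linalg.slogdet(Lambda_n)[1])
                + a0 * np.log(b0) - a_n * np.log(b_n)
                + gammaln(a_n) - gammaln(a0))
print(log_evidence)
```

Comparing such log-evidence values across candidate designs $\mathbf{X}$ (each with its own prior) is one way to carry out the Bayes-factor comparison described above.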

Other cases

In general, it may be impossible or impractical to derive the posterior distribution analytically. However, it is possible to approximate the posterior by an approximate Bayesian inference method such as Monte Carlo sampling,[8] INLA or variational Bayes.
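
As one illustration of such an approximation (a toy sketch, not a production sampler), a random-walk Metropolis algorithm can target the unnormalized posterior even when no conjugate form is available; the flat prior on $(\boldsymbol{\beta}, \log\sigma^2)$, the step size, and the chain length are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)

def log_post(theta):
    """Unnormalized log posterior of (beta, log sigma^2) under an (assumed) flat prior."""
    beta, log_s2 = theta[:k], theta[k]
    s2 = np.exp(log_s2)
    resid = y - X @ beta
    return -0.5 * n * np.log(s2) - 0.5 * resid @ resid / s2

theta = np.zeros(k + 1)                    # start at beta = 0, sigma^2 = 1
samples, step = [], 0.05
for _ in range(20000):
    prop = theta + step * rng.normal(size=k + 1)         # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop                                      # accept the proposal
    samples.append(theta.copy())
samples = np.array(samples[5000:])         # discard burn-in
print(samples.mean(axis=0))                # approximate posterior means of (beta, log sigma^2)
```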

The special case $\boldsymbol{\mu}_0 = 0$, $\boldsymbol{\Lambda}_0 = c\mathbf{I}$ is called ridge regression.
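
As an illustrative check (with the regularization constant $c$ chosen arbitrarily), the general posterior-mean formula with $\boldsymbol{\mu}_0 = 0$ and $\boldsymbol{\Lambda}_0 = c\mathbf{I}$ reproduces the ridge estimate $(\mathbf{X}^{\mathsf{T}}\mathbf{X} + c\mathbf{I})^{-1}\mathbf{X}^{\mathsf{T}}\mathbf{y}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, c = 50, 3, 5.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.8, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
Lambda_n = X.T @ X + c * np.eye(k)
mu_n = np.linalg.solve(Lambda_n, X.T @ X @ beta_hat)               # posterior mean with mu0 = 0
beta_ridge = np.linalg.solve(X.T @ X + c * np.eye(k), X.T @ y)     # ridge regression estimate
print(np.allclose(mu_n, beta_ridge))                               # True: the two coincide
```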

A similar analysis can be performed for the general case of multivariate regression, and part of this provides for Bayesian estimation of covariance matrices: see Bayesian multivariate linear regression.

Notes

  1. ^ Huang, Yunfei; Gompper, Gerhard; Sabass, Benedikt (2020). "A Bayesian traction force microscopy method with automated denoising in a user-friendly software package". Computer Physics Communications. 256: 107313. arXiv:2005.01377. Bibcode:2020CoPhC.25607313H. doi:10.1016/j.cpc.2020.107313.
  2. ^ See Jackman (2009), p. 101.
  3. ^ See Gelman et al. (2013), p. 354.
  4. ^ The intermediate steps of this computation can be found in O'Hagan (1994) at the beginning of the chapter on Linear models.
  5. ^ The intermediate steps are in Fahrmeir et al. (2009) on page 188.
  6. ^ The intermediate steps of this computation can be found in O'Hagan (1994) on page 257.
  7. ^ Chib, Siddhartha (1995). "Marginal Likelihood from the Gibbs Output". Journal of the American Statistical Association. 90 (432): 1313–1321. doi:10.2307/2291521.
  8. ^ Carlin and Louis (2008) and Gelman, et al. (2003) explain how to use sampling methods for Bayesian linear regression.

References

  • Box, G. E. P.; Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Wiley. ISBN 0-471-57428-7.
  • Carlin, Bradley P.; Louis, Thomas A. (2008). Bayesian Methods for Data Analysis (Third ed.). Boca Raton, FL: Chapman and Hall/CRC. ISBN 978-1-58488-697-6.
  • Fahrmeir, L.; Kneib, T.; Lang, S. (2009). Regression. Modelle, Methoden und Anwendungen (Second ed.). Heidelberg: Springer. doi:10.1007/978-3-642-01837-4. ISBN 978-3-642-01836-7.
  • Gelman, Andrew; et al. (2013). "Introduction to regression models". Bayesian Data Analysis (Third ed.). Boca Raton, FL: Chapman and Hall/CRC. pp. 353–380. ISBN 978-1-4398-4095-5.
  • Jackman, Simon (2009). "Regression models". Bayesian Analysis for the Social Sciences. Wiley. pp. 99–124. ISBN 978-0-470-01154-6.
  • Rossi, Peter E.; Allenby, Greg M.; McCulloch, Robert (2006). Bayesian Statistics and Marketing. John Wiley & Sons. ISBN 0-470-86367-6.
  • O'Hagan, Anthony (1994). Bayesian Inference. Kendall's Advanced Theory of Statistics. Vol. 2B (First ed.). Halsted. ISBN 0-340-52922-9.
