Movatterモバイル変換


[0]ホーム

URL:


Type:Package
Title:Efficient Leave-One-Out Cross-Validation and WAIC for BayesianModels
Version:2.8.0
Date:2024-07-03
Maintainer:Jonah Gabry <jsg2201@columbia.edu>
URL:https://mc-stan.org/loo/,https://discourse.mc-stan.org
BugReports:https://github.com/stan-dev/loo/issues
Description:Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.
License:GPL (≥ 3)
LazyData:TRUE
Depends:R (≥ 3.1.2)
Imports:checkmate, matrixStats (≥ 0.52), parallel, posterior (≥1.5.0), stats
Suggests:bayesplot (≥ 1.7.0), brms (≥ 2.10.0), ggplot2, graphics,knitr, rmarkdown, rstan, rstanarm (≥ 2.19.0), rstantools,spdep, testthat (≥ 2.1.0)
VignetteBuilder:knitr
Encoding:UTF-8
SystemRequirements:pandoc (>= 1.12.3), pandoc-citeproc
RoxygenNote:7.3.1
NeedsCompilation:no
Packaged:2024-07-03 16:26:26 UTC; jgabry
Author:Aki Vehtari [aut], Jonah Gabry [cre, aut], Måns Magnusson [aut], Yuling Yao [aut], Paul-Christian Bürkner [aut], Topi Paananen [aut], Andrew Gelman [aut], Ben Goodrich [ctb], Juho Piironen [ctb], Bruno Nicenboim [ctb], Leevi Lindgren [ctb]
Repository:CRAN
Date/Publication:2024-07-03 18:50:02 UTC

Efficient LOO-CV and WAIC for Bayesian models

Description

mc-stan.orgStan Development Team

This package implements the methods described in Vehtari, Gelman, andGabry (2017), Vehtari, Simpson, Gelman, Yao, and Gabry (2024), andYao et al. (2018). To get started see theloo packagevignettes, theloo() function for efficient approximate leave-one-outcross-validation (LOO-CV), thepsis() function for the Paretosmoothed importance sampling (PSIS) algorithm, orloo_model_weights() for an implementation of Bayesian stacking ofpredictive distributions from multiple models.

Details

Leave-one-out cross-validation (LOO-CV) and the widely applicableinformation criterion (WAIC) are methods for estimating pointwiseout-of-sample prediction accuracy from a fitted Bayesian model using thelog-likelihood evaluated at the posterior simulations of the parametervalues. LOO-CV and WAIC have various advantages over simpler estimates ofpredictive error such as AIC and DIC but are less used in practice becausethey involve additional computational steps. This package implements thefast and stable computations for approximate LOO-CV laid out in Vehtari,Gelman, and Gabry (2017). From existing posterior simulation draws, wecompute LOO-CV using Pareto smoothed importance sampling (PSIS; Vehtari,Simpson, Gelman, Yao, and Gabry, 2024), a new procedure for stabilizingand diagnosing importance weights. As a byproduct of our calculations,we also obtain approximate standard errors for estimated predictiveerrors and for comparing of predictive errors between two models.

We recommend PSIS-LOO-CV instead of WAIC, because PSIS provides usefuldiagnostics and effective sample size and Monte Carlo standard errorestimates.

Author(s)

Maintainer: Jonah Gabryjsg2201@columbia.edu

Authors:

Other contributors:

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Usingstacking to average Bayesian predictive distributions.Bayesian Analysis, advance publication, doi:10.1214/17-BA1091.(online).

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2019).Leave-One-Out Cross-Validation for Large Data.InThirty-sixth International Conference on Machine Learning,PMLR 97:4244-4253.

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2020).Leave-One-Out Cross-Validation for Model Comparison in Large Data.InProceedings of the 23rd International Conference on ArtificialIntelligence and Statistics (AISTATS), PMLR 108:341-351.

Epifani, I., MacEachern, S. N., and Peruggia, M. (2008). Case-deletionimportance sampling estimators: Central limit theorems and related results.Electronic Journal of Statistics2, 774-806.

Gelfand, A. E. (1996). Model determination using sampling-based methods. InMarkov Chain Monte Carlo in Practice, ed. W. R. Gilks, S. Richardson,D. J. Spiegelhalter, 145-162. London: Chapman and Hall.

Gelfand, A. E., Dey, D. K., and Chang, H. (1992). Model determination usingpredictive distributions with implementation via sampling-based methods. InBayesian Statistics 4, ed. J. M. Bernardo, J. O. Berger, A. P. Dawid,and A. F. M. Smith, 147-167. Oxford University Press.

Gelman, A., Hwang, J., and Vehtari, A. (2014). Understanding predictiveinformation criteria for Bayesian models.Statistics and Computing24, 997-1016.

Ionides, E. L. (2008). Truncated importance sampling.Journal ofComputational and Graphical Statistics17, 295-311.

Koopman, S. J., Shephard, N., and Creal, D. (2009). Testing the assumptionsbehind importance sampling.Journal of Econometrics149, 2-11.

Peruggia, M. (1997). On the variability of case-deletion importance samplingweights in the Bayesian linear model.Journal of the AmericanStatistical Association92, 199-207.

Stan Development Team (2017). The Stan C++ Library, Version 2.17.0.https://mc-stan.org.

Stan Development Team (2018). RStan: the R interface to Stan, Version 2.17.3.https://mc-stan.org.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation andwidely application information criterion in singular learning theory.Journal of Machine Learning Research11, 3571-3594.

Zhang, J., and Stephens, M. A. (2009). A new and efficient estimation methodfor the generalized Pareto distribution.Technometrics51,316-325.

See Also

Useful links:


Compute a point estimate from a draws object

Description

Compute a point estimate from a draws object

Usage

.compute_point_estimate(draws)## S3 method for class 'matrix'.compute_point_estimate(draws)## Default S3 method:.compute_point_estimate(draws)

Arguments

draws

A draws object with draws from the posterior.

Details

This is a generic function to compute point estimates from drawsobjects. The function is internal and should only be used by developers toenableloo_subsample() for arbitrary draws objects.

Value

A 1 by P matrix with point estimates from a draws object.


The number of posterior draws in a draws object.

Description

The number of posterior draws in a draws object.

Usage

.ndraws(x)## S3 method for class 'matrix'.ndraws(x)## Default S3 method:.ndraws(x)

Arguments

x

A draws object with posterior draws.

Details

This is a generic function to return the total number of draws froman arbitrary draws objects. The function is internal and should only beused by developers to enableloo_subsample() for arbitrary draws objects.

Value

An integer with the number of draws.


Thin a draws object

Description

Thin a draws object

Usage

.thin_draws(draws, loo_approximation_draws)## S3 method for class 'matrix'.thin_draws(draws, loo_approximation_draws)## S3 method for class 'numeric'.thin_draws(draws, loo_approximation_draws)## Default S3 method:.thin_draws(draws, loo_approximation_draws)

Arguments

draws

A draws object with posterior draws.

loo_approximation_draws

The number of posterior draws to return (ie after thinning).

Details

This is a generic function to thin draws from arbitrary drawsobjects. The function is internal and should only be used by developers toenableloo_subsample() for arbitrary draws objects.

Value

A thinned draws object.


Compute weighted expectations

Description

TheE_loo() function computes weighted expectations (means, variances,quantiles) using the importance weights obtained from thePSIS smoothing procedure. The expectations estimated by theE_loo() function assume that the PSIS approximation is working well.A smallPareto k estimate is necessary,but not sufficient, forE_loo() to give reliable estimates. Additionaldiagnostic checks for gauging the reliability of the estimates are indevelopment and will be added in a future release.

Usage

E_loo(x, psis_object, ...)## Default S3 method:E_loo(  x,  psis_object,  ...,  type = c("mean", "variance", "sd", "quantile"),  probs = NULL,  log_ratios = NULL)## S3 method for class 'matrix'E_loo(  x,  psis_object,  ...,  type = c("mean", "variance", "sd", "quantile"),  probs = NULL,  log_ratios = NULL)

Arguments

x

A numeric vector or matrix.

psis_object

An object returned bypsis().

...

Arguments passed to individual methods.

type

The type of expectation to compute. The options are"mean","variance","sd", and"quantile".

probs

For computing quantiles, a vector of probabilities.

log_ratios

Optionally, a vector or matrix (the same dimensions asx)of raw (not smoothed) log ratios. If working with log-likelihood values,the log ratios are thenegative of those values. Iflog_ratios isspecified we are able to compute more accuratePareto kdiagnostics specific toE_loo().

Value

A named list with the following components:

value

The result of the computation.

For the matrix method,value is a vector withncol(x)elements, with one exception: whentype="quantile" andmultiple values are specified inprobs thevalue component ofthe returned object is alength(probs) byncol(x) matrix.

For the default/vector method thevalue component is scalar, withone exception: whentype="quantile" and multiple valuesare specified inprobs thevalue component is a vector withlength(probs) elements.

pareto_k

Function-specific diagnostic.

For the matrix method it will be a vector of lengthncol(x)containing estimates of the shape parameterk of thegeneralized Pareto distribution. For the default/vector method,the estimate is a scalar. Iflog_ratios is not specified whencallingE_loo(), the smoothed log-weights are used to estimatePareto-k's, which may produce optimistic estimates.

Fortype="mean",type="var", andtype="sd", the returned Pareto-k isusually the maximum of the Pareto-k's for the left and right tail ofhrand the right tail ofr, wherer is the importance ratio andh=x fortype="mean" andh=x^2 fortype="var" andtype="sd".Ifh is binary, constant, or not finite, or iftype="quantile", thereturned Pareto-k is the Pareto-k for the right tail ofr.

Examples

if (requireNamespace("rstanarm", quietly = TRUE)) {# Use rstanarm package to quickly fit a model and get both a log-likelihood# matrix and draws from the posterior predictive distributionlibrary("rstanarm")# data from help("lm")ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)d <- data.frame(  weight = c(ctl, trt),  group = gl(2, 10, 20, labels = c("Ctl","Trt")))fit <- stan_glm(weight ~ group, data = d, refresh = 0)yrep <- posterior_predict(fit)dim(yrep)log_ratios <- -1 * log_lik(fit)dim(log_ratios)r_eff <- relative_eff(exp(-log_ratios), chain_id = rep(1:4, each = 1000))psis_object <- psis(log_ratios, r_eff = r_eff, cores = 2)E_loo(yrep, psis_object, type = "mean")E_loo(yrep, psis_object, type = "var")E_loo(yrep, psis_object, type = "sd")E_loo(yrep, psis_object, type = "quantile", probs = 0.5) # medianE_loo(yrep, psis_object, type = "quantile", probs = c(0.1, 0.9))# We can get more accurate Pareto k diagnostic if we also provide# the log_ratios argumentE_loo(yrep, psis_object, type = "mean", log_ratios = log_ratios)}

Pareto smoothed importance sampling (PSIS)using approximate posteriors

Description

Pareto smoothed importance sampling (PSIS)using approximate posteriors

Usage

ap_psis(log_ratios, log_p, log_g, ...)## S3 method for class 'array'ap_psis(log_ratios, log_p, log_g, ..., cores = getOption("mc.cores", 1))## S3 method for class 'matrix'ap_psis(log_ratios, log_p, log_g, ..., cores = getOption("mc.cores", 1))## Default S3 method:ap_psis(log_ratios, log_p, log_g, ...)

Arguments

log_ratios

The log-likelihood ratios (ie -log_liks)

log_p

The log-posterior (target) evaluated at S samples from theproposal distribution (g). A vector of length S.

log_g

The log-density (proposal) evaluated at S samples from theproposal distribution (g). A vector of length S.

...

Currently not in use.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

Methods (by class)


Model comparison (deprecated, old version)

Description

This function is deprecated. Please use the newloo_compare() functioninstead.

Usage

compare(..., x = list())

Arguments

...

At least two objects returned byloo() (orwaic()).

x

A list of at least two objects returned byloo() (orwaic()). This argument can be used as an alternative tospecifying the objects in....

Details

When comparing two fitted models, we can estimate the difference in theirexpected predictive accuracy by the difference inelpd_loo orelpd_waic (or multiplied by -2, if desired, to be on thedeviance scale).

When that difference,elpd_diff, is positive then the expectedpredictive accuracy for the second model is higher. A negativeelpd_diff favors the first model.

When usingcompare() with more than two models, the values in theelpd_diff andse_diff columns of the returned matrix arecomputed by making pairwise comparisons between each model and the modelwith the best ELPD (i.e., the model in the first row).Although theelpd_diff column is equal to the difference inelpd_loo, do not expect these_diff column to be equal to thethe difference inse_elpd_loo.

To compute the standard error of the difference in ELPD we use apaired estimate to take advantage of the fact that the same set ofNdata points was used to fit both models. These calculations should be mostuseful whenN is large, because then non-normality of thedistribution is not such an issue when estimating the uncertainty in thesesums. These standard errors, for all their flaws, should give a bettersense of uncertainty than what is obtained using the current standardapproach of comparing differences of deviances to a Chi-squareddistribution, a practice derived for Gaussian linear models orasymptotically, and which only applies to nested models in any case.

Value

A vector or matrix with class'compare.loo' that has its ownprint method. If exactly two objects are provided in... orx, then the difference in expected predictive accuracy and thestandard error of the difference are returned. If more than two objects areprovided then a matrix of summary information is returned (seeDetails).

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

Examples

## Not run: loo1 <- loo(log_lik1)loo2 <- loo(log_lik2)print(compare(loo1, loo2), digits = 3)print(compare(x = list(loo1, loo2)))waic1 <- waic(log_lik1)waic2 <- waic(log_lik2)compare(waic1, waic2)## End(Not run)

Continuously ranked probability score

Description

Thecrps() andscrps() functions and their⁠loo_*()⁠ counterparts can beused to compute the continuously ranked probability score (CRPS) and scaledCRPS (SCRPS) (see Bolin and Wallin, 2022). CRPS is a proper scoring rule, andstrictly proper when the first moment of the predictive distribution isfinite. Both can be expressed in terms of samples form the predictivedistribution. See e.g. Gneiting and Raftery (2007) for a comprehensivediscussion on CRPS.

Usage

crps(x, ...)scrps(x, ...)loo_crps(x, ...)loo_scrps(x, ...)## S3 method for class 'matrix'crps(x, x2, y, ..., permutations = 1)## S3 method for class 'numeric'crps(x, x2, y, ..., permutations = 1)## S3 method for class 'matrix'loo_crps(  x,  x2,  y,  log_lik,  ...,  permutations = 1,  r_eff = 1,  cores = getOption("mc.cores", 1))## S3 method for class 'matrix'scrps(x, x2, y, ..., permutations = 1)## S3 method for class 'numeric'scrps(x, x2, y, ..., permutations = 1)## S3 method for class 'matrix'loo_scrps(  x,  x2,  y,  log_lik,  ...,  permutations = 1,  r_eff = 1,  cores = getOption("mc.cores", 1))

Arguments

x

AS byN matrix (draws by observations), or a vector of lengthS when only single observation is provided iny.

...

Passed on toE_loo() in the⁠loo_*()⁠ version of thesefunctions.

x2

Independent draws from the same distribution as draws inx.Should be of the identical dimension.

y

A vector of observations or a single value.

permutations

An integer, with default value of 1, specifying how manytimes the expected value of |X - X'| (⁠|x - x2|⁠) is computed. The roworder ofx2 is shuffled as elementsx andx2 are typically drawngiven the same values of parameters. This happens, e.g., when one callsposterior_predict() twice for a fittedrstanarm orbrmsmodel. Generating more permutations is expected to decrease the variance ofthe computed expected value.

log_lik

A log-likelihood matrix the same size asx.

r_eff

An optional vector of relative effective sample size estimatescontaining one element per observation. Seepsis() for details.

cores

The number of cores to use for parallelization of⁠[psis()]⁠.Seepsis() for details.

Details

To compute (S)CRPS, the user needs to provide two sets of draws,x andx2, from the predictive distribution. This is due to the fact that formulasused to compute CRPS involve an expectation of the absolute difference ofxandx2, both having the same distribution. See thepermutations argument,as well as Gneiting and Raftery (2007) for details.

Value

A list containing two elements:estimates andpointwise.The former reports estimator and standard error and latter the pointwisevalues.

References

Bolin, D., & Wallin, J. (2022). Local scale invariance and robustness ofproper scoring rules. arXiv.doi:10.48550/arXiv.1912.05642

Gneiting, T., & Raftery, A. E. (2007). Strictly Proper Scoring Rules,Prediction, and Estimation. Journal of the American Statistical Association,102(477), 359–378.

Examples

## Not run: # An example using rstanarmlibrary(rstanarm)data("kidiq")fit <- stan_glm(kid_score ~ mom_hs + mom_iq, data = kidiq)ypred1 <- posterior_predict(fit)ypred2 <- posterior_predict(fit)crps(ypred1, ypred2, y = fit$y)loo_crps(ypred1, ypred2, y = fit$y, log_lik = log_lik(fit))## End(Not run)

Generic (expected) log-predictive density

Description

Theelpd() methods for arrays and matrices can compute the expected logpointwise predictive density for a new dataset or the log pointwisepredictive density of the observed data (an overestimate of the elpd).

Usage

elpd(x, ...)## S3 method for class 'array'elpd(x, ...)## S3 method for class 'matrix'elpd(x, ...)

Arguments

x

A log-likelihood array or matrix. TheMethods (by class)section, below, has detailed descriptions of how to specify the inputs foreach method.

...

Currently ignored.

Details

Theelpd() function is an S3 generic and methods are provided for3-D pointwise log-likelihood arrays and matrices.

Methods (by class)

See Also

The vignetteHoldout validation and K-fold cross-validation of Stanprograms with the loo package for demonstrations of using theelpd()methods.

Examples

# Calculate the lpd of the observed dataLLarr <- example_loglik_array()elpd(LLarr)

Objects to use in examples and tests

Description

Example pointwise log-likelihood objects to use in demonstrations and tests.See theValue andExamples sections below.

Usage

example_loglik_array()example_loglik_matrix()

Value

example_loglik_array() returns a 500 (draws) x 2 (chains) x 32(observations) pointwise log-likelihood array.

example_loglik_matrix() returns the same pointwise log-likelihood valuesasexample_loglik_array() but reshaped into a 1000 (draws*chains) x 32(observations) matrix.

Examples

LLarr <- example_loglik_array()(dim_arr <- dim(LLarr))LLmat <- example_loglik_matrix()(dim_mat <- dim(LLmat))all.equal(dim_mat[1], dim_arr[1] * dim_arr[2])all.equal(dim_mat[2], dim_arr[3])all.equal(LLarr[, 1, ], LLmat[1:500, ])all.equal(LLarr[, 2, ], LLmat[501:1000, ])

Extract pointwise log-likelihood from a Stan model

Description

Convenience function for extracting the pointwise log-likelihoodmatrix or array from astanfit object from therstan package.Note: recent versions ofrstan now include aloo() method forstanfit objects that handles this internally.

Usage

extract_log_lik(stanfit, parameter_name = "log_lik", merge_chains = TRUE)

Arguments

stanfit

Astanfit object (rstan package).

parameter_name

A character string naming the parameter (or generatedquantity) in the Stan model corresponding to the log-likelihood.

merge_chains

IfTRUE (the default), all Markov chains aremerged together (i.e., stacked) and a matrix is returned. IfFALSEthey are kept separate and an array is returned.

Details

Stan does not automatically compute and store the log-likelihood. Itis up to the user to incorporate it into the Stan program if it is to beextracted after fitting the model. In a Stan model, the pointwise loglikelihood can be coded as a vector in the transformed parameters block(and then summed up in the model block) or it can be coded entirely in thegenerated quantities block. We recommend using the generated quantitiesblock so that the computations are carried out only once per iterationrather than once per HMC leapfrog step.

For example, the following is the⁠generated quantities⁠ block forcomputing and saving the log-likelihood for a linear regression model withN data points, outcomey, predictor matrixX,coefficientsbeta, and standard deviationsigma:

⁠vector[N] log_lik;⁠

for (n in 1:N) log_lik[n] = normal_lpdf(y[n] | X[n, ] * beta, sigma);

Value

Ifmerge_chains=TRUE, anS byN matrix of(post-warmup) extracted draws, whereS is the size of the posteriorsample andN is the number of data points. Ifmerge_chains=FALSE, anI byC byN array, whereI \times C = S.

References

Stan Development Team (2017). The Stan C++ Library, Version 2.16.0.https://mc-stan.org/

Stan Development Team (2017). RStan: the R interface to Stan, Version 2.16.1.https://mc-stan.org/


Find the model names associated with"loo" objects

Description

Find the model names associated with"loo" objects

Usage

find_model_names(x)

Arguments

x

List of"loo" objects.

Value

Character vector of model names the same length asx.


Estimate parameters of the Generalized Pareto distribution

Description

Given a samplex, Estimate the parametersk and\sigma ofthe generalized Pareto distribution (GPD), assuming the location parameter is0. By default the fit uses a prior fork, which will stabilizeestimates for very small sample sizes (and low effective sample sizes in thecase of MCMC samples). The weakly informative prior is a Gaussian priorcentered at 0.5.

Usage

gpdfit(x, wip = TRUE, min_grid_pts = 30, sort_x = TRUE)

Arguments

x

A numeric vector. The sample from which to estimate the parameters.

wip

Logical indicating whether to adjustk based on a weaklyinformative Gaussian prior centered on 0.5. Defaults toTRUE.

min_grid_pts

The minimum number of grid points used in the fittingalgorithm. The actual number used ismin_grid_pts + floor(sqrt(length(x))).

sort_x

IfTRUE (the default), the first step in the fittingalgorithm is to sort the elements ofx. Ifx is alreadysorted in ascending order thensort_x can be set toFALSE toskip the initial sorting step.

Details

Here the parameterk is the negative ofk in Zhang &Stephens (2009).

Value

A named list with componentsk andsigma.

References

Zhang, J., and Stephens, M. A. (2009). A new and efficient estimation methodfor the generalized Pareto distribution.Technometrics51, 316-325.

See Also

psis(),pareto-k-diagnostic


A parent class for different importance sampling methods.

Description

A parent class for different importance sampling methods.

Usage

importance_sampling(log_ratios, method, ...)## S3 method for class 'array'importance_sampling(  log_ratios,  method,  ...,  r_eff = 1,  cores = getOption("mc.cores", 1))## S3 method for class 'matrix'importance_sampling(  log_ratios,  method,  ...,  r_eff = 1,  cores = getOption("mc.cores", 1))## Default S3 method:importance_sampling(log_ratios, method, ..., r_eff = 1)

Arguments

log_ratios

An array, matrix, or vector of importance ratios on the logscale (for PSIS-LOO these arenegative log-likelihood values). See theMethods (by class) section below for a detailed description of howto specify the inputs for each method.

method

The importance sampling method to use. The following methodsare implemented:

  • "psis": Pareto-Smoothed Importance Sampling (PSIS). Default method.

  • "tis": Truncated Importance Sampling (TIS) with truncation atsqrt(S), whereS is the number of posterior draws.

  • "sis": Standard Importance Sampling (SIS).

...

Arguments passed on to the various methods.

r_eff

Vector of relative effective sample size estimates containingone element per observation. The values provided should be the relativeeffective sample sizes of1/exp(log_ratios) (i.e.,1/ratios).This is related to the relative efficiency of estimating the normalizingterm in self-normalizing importance sampling. Ifr_eff is notprovided then the reported PSIS effective sample sizes and Monte Carloerror estimates can be over-optimistic. If the posterior draws are (near)independent thenr_eff=1 can be used.r_eff has to be a scalar (samevalue is used for all observations) or a vector with length equal to thenumber of observations. The default value is 1. See therelative_eff()helper function for computingr_eff.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).


Generic function for K-fold cross-validation for developers

Description

For developers of Bayesian modeling packages,loo includesa generic functionkfold() so that methods may be defined for K-foldCV without name conflicts between packages. See, for example, thekfold() methods in therstanarm andbrms packages.

TheValue section below describes the objects thatkfold()methods should return in order to be compatible withloo_compare() and theloo package print methods.

Usage

kfold(x, ...)is.kfold(x)

Arguments

x

A fitted model object.

...

Arguments to pass to specific methods.

Value

For developers defining akfold() method for a class"foo", thekfold.foo() function should return a list with classc("kfold", "loo") with at least the following named elements:

It is important for the object to have at least these classes andcomponents so that it is compatible with other functions likeloo_compare() andprint() methods.


Helper functions for K-fold cross-validation

Description

These functions can be used to generate indexes for use withK-fold cross-validation. See theDetails section for explanations.

Usage

kfold_split_random(K = 10, N = NULL)kfold_split_stratified(K = 10, x = NULL)kfold_split_grouped(K = 10, x = NULL)

Arguments

K

The number of folds to use.

N

The number of observations in the data.

x

A discrete variable of lengthN with at leastK levels (uniquevalues). Will be coerced to afactor.

Details

kfold_split_random() splits the data intoK groupsof equal size (or roughly equal size).

For a categorical variablexkfold_split_stratified()splits the observations intoK groups ensuring that relativecategory frequencies are approximately preserved.

For a grouping variablex,kfold_split_grouped() placesall observations inx from the same group/level together inthe same fold. The selection of which groups/levels go into whichfold (relevant when when there are more groups than folds) israndomized.

Value

An integer vector of lengthN where each element is an index in1:K.

Examples

ids <- kfold_split_random(K = 5, N = 20)print(ids)table(ids)x <- sample(c(0, 1), size = 200, replace = TRUE, prob = c(0.05, 0.95))table(x)ids <- kfold_split_stratified(K = 5, x = x)print(ids)table(ids, x)grp <- gl(n = 50, k = 15, labels = state.name)length(grp)head(table(grp))ids_10 <- kfold_split_grouped(K = 10, x = grp)(tab_10 <- table(grp, ids_10))colSums(tab_10)ids_9 <- kfold_split_grouped(K = 9, x = grp)(tab_9 <- table(grp, ids_9))colSums(tab_9)

Efficient approximate leave-one-out cross-validation (LOO)

Description

Theloo() methods for arrays, matrices, and functions compute PSIS-LOOCV, efficient approximate leave-one-out (LOO) cross-validation for Bayesianmodels using Pareto smoothed importance sampling (PSIS). This isan implementation of the methods described in Vehtari, Gelman, and Gabry(2017) and Vehtari, Simpson, Gelman, Yao, and Gabry (2024).

Theloo_i() function enables testing log-likelihoodfunctions for use with theloo.function() method.

Usage

loo(x, ...)## S3 method for class 'array'loo(  x,  ...,  r_eff = 1,  save_psis = FALSE,  cores = getOption("mc.cores", 1),  is_method = c("psis", "tis", "sis"))## S3 method for class 'matrix'loo(  x,  ...,  r_eff = 1,  save_psis = FALSE,  cores = getOption("mc.cores", 1),  is_method = c("psis", "tis", "sis"))## S3 method for class ''function''loo(  x,  ...,  data = NULL,  draws = NULL,  r_eff = 1,  save_psis = FALSE,  cores = getOption("mc.cores", 1),  is_method = c("psis", "tis", "sis"))loo_i(i, llfun, ..., data = NULL, draws = NULL, r_eff = 1, is_method = "psis")is.loo(x)is.psis_loo(x)

Arguments

x

A log-likelihood array, matrix, or function. TheMethods (by class)section, below, has detailed descriptions of how to specify the inputs foreach method.

r_eff

Vector of relative effective sample size estimates for thelikelihood (exp(log_lik)) of each observation. This is related tothe relative efficiency of estimating the normalizing term inself-normalized importance sampling when using posterior draws obtainedwith MCMC. If MCMC draws are used andr_eff is not provided thenthe reported PSIS effective sample sizes and Monte Carlo error estimatescan be over-optimistic. If the posterior draws are (near) independent thenr_eff=1 can be used.r_eff has to be a scalar (same value is usedfor all observations) or a vector with length equal to the number ofobservations. The default value is 1. See therelative_eff() helperfunctions for help computingr_eff.

save_psis

Should thepsis object created internally byloo() besaved in the returned object? Theloo() function callspsis()internally but by default discards the (potentially large)psis objectafter using it to compute the LOO-CV summaries. Settingsave_psis=TRUEwill add apsis_object component to the list returned byloo.This is useful if you plan to use theE_loo() function to computeweighted expectations after runningloo. Several functions in thebayesplot package also acceptpsis objects.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

is_method

The importance sampling method to use. The following methodsare implemented:

  • "psis": Pareto-Smoothed Importance Sampling (PSIS). Default method.

  • "tis": Truncated Importance Sampling (TIS) with truncation atsqrt(S), whereS is the number of posterior draws.

  • "sis": Standard Importance Sampling (SIS).

data,draws,...

For theloo.function() method and theloo_i()function, these are the data, posterior draws, and other arguments to passto the log-likelihood function. See theMethods (by class) sectionbelow for details on how to specify these arguments.

i

Forloo_i(), an integer in1:N.

llfun

Forloo_i(), the same asx for theloo.function() method. A log-likelihood function as described in theMethods (by class) section.

Details

Theloo() function is an S3 generic and methods are provided for3-D pointwise log-likelihood arrays, pointwise log-likelihood matrices, andlog-likelihood functions. The array and matrix methods are the mostconvenient, but for models fit to very large datasets theloo.function()method is more memory efficient and may be preferable.

Value

Theloo() methods return a named list with classc("psis_loo", "loo") and components:

estimates

A matrix with two columns (Estimate,SE) and three rows (elpd_loo,p_loo,looic). This contains point estimates and standard errors of theexpected log pointwise predictive density (elpd_loo), theeffective number of parameters (p_loo) and the LOOinformation criterionlooic (which is just-2 * elpd_loo, i.e.,converted to deviance scale).

pointwise

A matrix with five columns (and number of rows equal to the number ofobservations) containing the pointwise contributions of the measures(elpd_loo,mcse_elpd_loo,p_loo,looic,influence_pareto_k).in addition to the three measures inestimates, we also reportpointwise values of the Monte Carlo standard error ofelpd_loo(mcse_elpd_loo), and statistics describing the influence ofeach observation on the posterior distribution (influence_pareto_k).These are the estimates of the shape parameterk of thegeneralized Pareto fit to the importance ratios for each leave-one-outdistribution (see thepareto-k-diagnostic page for details).

diagnostics

A named list containing two vectors:

  • pareto_k: Importance sampling reliability diagnostics. By default,these are equal to theinfluence_pareto_k inpointwise.Some algorithms can improve importance sampling reliability andmodify these diagnostics. See thepareto-k-diagnostic page for details.

  • n_eff: PSIS effective sample size estimates.

psis_object

This component will beNULL unless thesave_psis argument is set toTRUE when callingloo(). In that casepsis_object will be the objectof class"psis" that is created when theloo() function callspsis()internally to do the PSIS procedure.

Theloo_i() function returns a named list with componentspointwise anddiagnostics. These components have the samestructure as thepointwise anddiagnostics components of theobject returned byloo() except they contain results for only a singleobservation.

Methods (by class)

Definingloo() methods in a package

Package developers can defineloo() methods for fitted models objects. See the exampleloo.stanfit()method in theExamples section below for an example of defining amethod that callsloo.array(). Theloo.stanreg() method in therstanarm package is an example of defining a method that callsloo.function().

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

Examples

### Array and matrix methods (using example objects included with loo package)# Array methodLLarr <- example_loglik_array()rel_n_eff <- relative_eff(exp(LLarr))loo(LLarr, r_eff = rel_n_eff, cores = 2)# Matrix methodLLmat <- example_loglik_matrix()rel_n_eff <- relative_eff(exp(LLmat), chain_id = rep(1:2, each = 500))loo(LLmat, r_eff = rel_n_eff, cores = 2)### Using log-likelihood function instead of array or matrixset.seed(124)# Simulate data and draw from posteriorN <- 50; K <- 10; S <- 100; a0 <- 3; b0 <- 2p <- rbeta(1, a0, b0)y <- rbinom(N, size = K, prob = p)a <- a0 + sum(y); b <- b0 + N * K - sum(y)fake_posterior <- as.matrix(rbeta(S, a, b))dim(fake_posterior) # S x 1fake_data <- data.frame(y,K)dim(fake_data) # N x 2llfun <- function(data_i, draws) {  # each time called internally within loo the arguments will be equal to:  # data_i: ith row of fake_data (fake_data[i,, drop=FALSE])  # draws: entire fake_posterior matrix  dbinom(data_i$y, size = data_i$K, prob = draws, log = TRUE)}# Use the loo_i function to check that llfun works on a single observation# before running on all obs. For example, using the 3rd obs in the data:loo_3 <- loo_i(i = 3, llfun = llfun, data = fake_data, draws = fake_posterior)print(loo_3$pointwise[, "elpd_loo"])# Use loo.function method (default r_eff=1 is used as this posterior not obtained via MCMC)loo_with_fn <- loo(llfun, draws = fake_posterior, data = fake_data)# If we look at the elpd_loo contribution from the 3rd obs it should be the# same as what we got above with the loo_i function and i=3:print(loo_with_fn$pointwise[3, "elpd_loo"])print(loo_3$pointwise[, "elpd_loo"])# Check that the loo.matrix method gives same answer as loo.function methodlog_lik_matrix <- sapply(1:N, function(i) {  llfun(data_i = fake_data[i,, drop=FALSE], draws = fake_posterior)})loo_with_mat <- loo(log_lik_matrix)all.equal(loo_with_mat$estimates, loo_with_fn$estimates) # should be TRUE!## Not run: ### For package developers: defining loo methods# An example of a possible loo method for 'stanfit' objects (rstan package).# A similar method is included in the rstan package.# In order for users to be able to call loo(stanfit) instead of# loo.stanfit(stanfit) the NAMESPACE needs to be handled appropriately# (roxygen2 and devtools packages are good for that).#loo.stanfit <- function(x,         pars = "log_lik",         ...,         save_psis = FALSE,         cores = getOption("mc.cores", 1)) {  stopifnot(length(pars) == 1L)  LLarray <- loo::extract_log_lik(stanfit = x,                                  parameter_name = pars,                                  merge_chains = FALSE)  r_eff <- loo::relative_eff(x = exp(LLarray), cores = cores)  loo::loo.array(LLarray,                 r_eff = r_eff,                 cores = cores,                 save_psis = save_psis)}## End(Not run)

Datasets for loo examples and vignettes

Description

Small datasets for use inloo examples and vignettes. TheKlineandmilk datasets are also included in therethinking package(McElreath, 2016a), but we include them here asrethinking is noton CRAN.

Details

Currently the data sets included are:

References

Hinde and Milligan. 2011.Evolutionary Anthropology 20:9-23.

Kline, M.A. and R. Boyd. 2010.Proc R Soc B 277:2559-2564.

McElreath, R. (2016a). rethinking: Statistical Rethinking book package.R package version 1.59.

McElreath, R. (2016b).Statistical rethinking: A Bayesian course withexamples in R and Stan. Chapman & Hall/CRC.

A. Tsanas, M.A. Little, C. Fox, L.O. Ramig: Objective automatic assessment ofrehabilitative speech treatment in Parkinson's disease, IEEETransactions on Neural Systems and Rehabilitation Engineering, Vol. 22, pp.181-190, January 2014

Examples

str(Kline)str(milk)

LOO package glossary

Description

The pages provides definitions to key terms. Also see theFAQ page ontheloo website for answers to frequently asked questions.

Note: VGG2017 refers to Vehtari, Gelman, and Gabry (2017). SeeReferences, below.

ELPD andelpd_loo

The ELPD is the theoretical expected log pointwise predictive density for a newdataset (Eq 1 in VGG2017), which can be estimated, e.g., usingcross-validation.elpd_loo is the Bayesian LOO estimate of theexpected log pointwise predictive density (Eq 4 in VGG2017) andis a sum of N individual pointwise log predictive densities. Probabilitydensities can be smaller or larger than 1, and thus log predictive densitiescan be negative or positive. For simplicity the ELPD acronym is used also forexpected log pointwise predictive probabilities for discrete models.Probabilities are always equal or less than 1, and thus log predictiveprobabilities are 0 or negative.

Standard error ofelpd_loo

Aselpd_loo is defined as the sum of N independent components (Eq 4 inVGG2017), we can compute the standard error by using the standard deviationof the N components and multiplying bysqrt(N) (Eq 23 in VGG2017).This standard error is a coarse description of our uncertainty about thepredictive performance for unknown future data. When N is small or there issevere model misspecification, the current SE estimate is overoptimistic andthe actual SE can even be twice as large. Even for moderate N, when the SEestimate is an accurate estimate for the scale, it ignores the skewness. Whenmaking model comparisons, the SE of the component-wise (pairwise) differencesshould be used instead (see these_diff section below and Eq 24 inVGG2017). Sivula et al. (2022) discuss the conditions when the normalapproximation used for SE andse_diff is good.

Monte Carlo SE of elpd_loo

The Monte Carlo standard error is the estimate for the computational accuracyof MCMC and importance sampling used to computeelpd_loo. Usually thisis negligible compared to the standard describing the uncertainty due tofinite number of observations (Eq 23 in VGG2017).

p_loo (effective number of parameters)

p_loo is the difference betweenelpd_loo and the non-cross-validatedlog posterior predictive density. It describes how much more difficult itis to predict future data than the observed data. Asymptotically undercertain regularity conditions,p_loo can be interpreted as theeffective number of parameters. In well behaving casesp_loo < N andp_loo < p, wherep is the total number of parameters in themodel.p_loo > N orp_loo > p indicates that the model has veryweak predictive capability and may indicate a severe model misspecification.See below for more on interpretingp_loo when there are warningsabout high Pareto k diagnostic values.

Pareto k estimates

The Paretok estimate is a diagnostic for Pareto smoothed importancesampling (PSIS), which is used to compute components ofelpd_loo. Inimportance-sampling LOO the full posterior distribution is used as theproposal distribution. The Pareto k diagnostic estimates how far anindividual leave-one-out distribution is from the full distribution. Ifleaving out an observation changes the posterior too much then importancesampling is not able to give a reliable estimate. Pareto smoothing stabilizesimportance sampling and guarantees a finite variance estimate at thecost of some bias.

The diagnostic threshold for Paretok depends on sample sizeS (sample size dependent threshold was introduced by Vehtariet al., 2024, and before that fixed thresholds of 0.5 and 0.7 wererecommended). For simplicity,loo package uses the nominal samplesizeS when computing the sample size specificthreshold. This provides an optimistic threshold if the effectivesample size is less than 2200, but even then if ESS/S > 1/2 the differenceis usually negligible. Thinning of MCMC draws can be used to improvethe ratio ESS/S.

Paretok is also useful as a measure of influence of anobservation. Highly influential observations have highkvalues. Very highk values often indicate modelmisspecification, outliers or mistakes in data processing. SeeSection 6 of Gabry et al. (2019) for an example.

Interpretingp_loo when Paretok is large

Ifk > 0.7 then we can also look atthep_loo estimate for some additional information about the problem:

elpd_diff

elpd_diff is the difference inelpd_loo for two models. If morethan two models are compared, the difference is computed relative to themodel with highestelpd_loo.

se_diff

The standard error of component-wise differences of elpd_loo (Eq 24 inVGG2017) between two models. This SE issmaller than the SE forindividual models due to correlation (i.e., if some observations are easierand some more difficult to predict for all models).

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

Sivula, T, Magnusson, M., Matamoros A. A., and Vehtari,A. (2022). Uncertainty in Bayesian leave-one-outcross-validation based model comparison.preprint arXiv:2008.10296v3..

Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. andGelman, A. (2019), Visualization in Bayesian workflow.J. R. Stat. Soc. A, 182: 389-402. doi:10.1111/rssa.12378(journal version,preprint arXiv:1709.01449,code on GitHub)


Efficient approximate leave-one-out cross-validation (LOO) for posteriorapproximations

Description

Efficient approximate leave-one-out cross-validation (LOO) for posteriorapproximations

Usage

loo_approximate_posterior(x, log_p, log_g, ...)## S3 method for class 'array'loo_approximate_posterior(  x,  log_p,  log_g,  ...,  save_psis = FALSE,  cores = getOption("mc.cores", 1))## S3 method for class 'matrix'loo_approximate_posterior(  x,  log_p,  log_g,  ...,  save_psis = FALSE,  cores = getOption("mc.cores", 1))## S3 method for class ''function''loo_approximate_posterior(  x,  ...,  data = NULL,  draws = NULL,  log_p = NULL,  log_g = NULL,  save_psis = FALSE,  cores = getOption("mc.cores", 1))

Arguments

x

A log-likelihood array, matrix, or function.TheMethods (by class) section, below, has detailed descriptions of howto specify the inputs for each method.

log_p

The log-posterior (target) evaluated at S samples from theproposal distribution (g). A vector of length S.

log_g

The log-density (proposal) evaluated at S samples from theproposal distribution (g). A vector of length S.

save_psis

Should the"psis" object created internally byloo_approximate_posterior() be saved in the returned object? Seeloo() for details.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

data,draws,...

For theloo_approximate_posterior.function() method,these are the data, posterior draws, and other arguments to pass to thelog-likelihood function. See theMethods (by class) section below fordetails on how to specify these arguments.

Details

Theloo_approximate_posterior() function is an S3 generic andmethods are provided for 3-D pointwise log-likelihood arrays, pointwiselog-likelihood matrices, and log-likelihood functions. The implementationworks for posterior approximations where it is possible to compute the logdensity for the posterior approximation.

Value

Theloo_approximate_posterior() methods return a named list withclassc("psis_loo_ap", "psis_loo", "loo"). It has the same structureas the objects returned byloo() but with the additional slot:

posterior_approximation

A list with two vectors,log_p andlog_g of the same lengthcontaining the posterior density and the approximation densityfor the individual draws.

Methods (by class)

References

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2019).Leave-One-Out Cross-Validation for Large Data.InThirty-sixth International Conference on Machine Learning,PMLR 97:4244-4253.

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2020).Leave-One-Out Cross-Validation for Model Comparison in Large Data.InProceedings of the 23rd International Conference on ArtificialIntelligence and Statistics (AISTATS), PMLR 108:341-351.

See Also

loo(),psis(),loo_compare()


Model comparison

Description

Compare fitted models based onELPD.

By default the print method shows only the most important information. Useprint(..., simplify=FALSE) to print a more detailed summary.

Usage

loo_compare(x, ...)## Default S3 method:loo_compare(x, ...)## S3 method for class 'compare.loo'print(x, ..., digits = 1, simplify = TRUE)## S3 method for class 'compare.loo_ss'print(x, ..., digits = 1, simplify = TRUE)

Arguments

x

An object of class"loo" or a list of such objects. If a list isused then the list names will be used as the model names in the output. SeeExamples.

...

Additional objects of class"loo", if not passed in as a singlelist.

digits

For the print method only, the number of digits to use whenprinting.

simplify

For the print method only, should only the essential columnsof the summary matrix be printed? The entire matrix is always returned, butby default only the most important columns are printed.

Details

When comparing two fitted models, we can estimate the difference in theirexpected predictive accuracy by the difference inelpd_loo orelpd_waic (or multiplied by-2, ifdesired, to be on the deviance scale).

When usingloo_compare(), the returned matrix will have one row per modeland several columns of estimates. The values in theelpd_diff andse_diff columns of thereturned matrix are computed by making pairwise comparisons between eachmodel and the model with the largest ELPD (the model in the first row). Forthis reason theelpd_diff column will always have the value0 in thefirst row (i.e., the difference between the preferred model and itself) andnegative values in subsequent rows for the remaining models.

To compute the standard error of the difference inELPD —which should not be expected to equal the difference of the standard errors— we use a paired estimate to take advantage of the fact that the sameset ofN data points was used to fit both models. These calculationsshould be most useful whenN is large, because then non-normality ofthe distribution is not such an issue when estimating the uncertainty inthese sums. These standard errors, for all their flaws, should give abetter sense of uncertainty than what is obtained using the currentstandard approach of comparing differences of deviances to a Chi-squareddistribution, a practice derived for Gaussian linear models orasymptotically, and which only applies to nested models in any case.Sivula et al. (2022) discuss the conditions when the normalapproximation used for SE andse_diff is good.

If more than11 models are compared, we internally recompute the modeldifferences using the median model by ELPD as the baseline model. We thenestimate whether the differences in predictive performance are potentiallydue to chance as described by McLatchie and Vehtari (2023). This will flaga warning if it is deemed that there is a risk of over-fitting due to theselection process. In that case users are recommended to avoid modelselection based on LOO-CV, and instead to favor model averaging/stacking orprojection predictive inference.

Value

A matrix with class"compare.loo" that has its ownprint method. See theDetails section.

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

Sivula, T, Magnusson, M., Matamoros A. A., and Vehtari, A. (2022).Uncertainty in Bayesian leave-one-out cross-validation based modelcomparison.preprint arXiv:2008.10296v3..

McLatchie, Y., and Vehtari, A. (2023). Efficient estimation andcorrection of selection-induced bias with order statistics.preprint arXiv:2309.03742

See Also

Examples

# very artificial example, just for demonstration!LL <- example_loglik_array()loo1 <- loo(LL)     # should be worst model when comparedloo2 <- loo(LL + 1) # should be second best model when comparedloo3 <- loo(LL + 2) # should be best model when comparedcomp <- loo_compare(loo1, loo2, loo3)print(comp, digits = 2)# show more details with simplify=FALSE# (will be the same for all models in this artificial example)print(comp, simplify = FALSE, digits = 3)# can use a list of objects with custom names# will use apple, banana, and cherry, as the names in the outputloo_compare(list("apple" = loo1, "banana" = loo2, "cherry" = loo3))## Not run: # works for waic (and kfold) tooloo_compare(waic(LL), waic(LL - 10))## End(Not run)

Model averaging/weighting via stacking or pseudo-BMA weighting

Description

Model averaging via stacking of predictive distributions, pseudo-BMAweighting or pseudo-BMA+ weighting with the Bayesian bootstrap. See Yao etal. (2018), Vehtari, Gelman, and Gabry (2017), and Vehtari, Simpson,Gelman, Yao, and Gabry (2024) for background.

Usage

loo_model_weights(x, ...)## Default S3 method:loo_model_weights(  x,  ...,  method = c("stacking", "pseudobma"),  optim_method = "BFGS",  optim_control = list(),  BB = TRUE,  BB_n = 1000,  alpha = 1,  r_eff_list = NULL,  cores = getOption("mc.cores", 1))stacking_weights(lpd_point, optim_method = "BFGS", optim_control = list())pseudobma_weights(lpd_point, BB = TRUE, BB_n = 1000, alpha = 1)

Arguments

x

A list of"psis_loo" objects (objects returned byloo()) orpointwise log-likelihood matrices or , one for each model. If the listelements are named the names will be used to label the models in theresults. Each matrix/object should have dimensionsS byN,whereS is the size of the posterior sample (with all chains merged)andN is the number of data points. Ifx is a list oflog-likelihood matrices thenloo() is called internally on each matrix.Currently theloo_model_weights() function is not implemented to be usedwith results from K-fold CV, but you can still obtain weights using K-foldCV results by calling thestacking_weights() function directly.

...

Unused, except for the generic to pass arguments to individualmethods.

method

Either"stacking" (the default) or"pseudobma", indicating which methodto use for obtaining the weights."stacking" refers to stacking ofpredictive distributions and"pseudobma" refers to pseudo-BMA+ weighting(or plain pseudo-BMA weighting if argumentBB isFALSE).

optim_method

Ifmethod="stacking", a string passed to themethodargument ofstats::constrOptim() to specify the optimization algorithm.The default isoptim_method="BFGS", but other options are available (seestats::optim()).

optim_control

Ifmethod="stacking", a list of control parameters foroptimization passed to thecontrol argument ofstats::constrOptim().

BB

Logical used when"method"="pseudobma". IfTRUE (the default), the Bayesian bootstrap will be used to adjustthe pseudo-BMA weighting, which is called pseudo-BMA+ weighting. It helpsregularize the weight away from 0 and 1, so as to reduce the variance.

BB_n

For pseudo-BMA+ weighting only, the number of samples to use forthe Bayesian bootstrap. The default isBB_n=1000.

alpha

Positive scalar shape parameter in the Dirichlet distributionused for the Bayesian bootstrap. The default isalpha=1, whichcorresponds to a uniform distribution on the simplex space.

r_eff_list

Optionally, a list of relative effective sample sizeestimates for the likelihood(exp(log_lik)) of each observation ineach model. Seepsis() andrelative_eff() helperfunction for computingr_eff. Ifx is a list of"psis_loo"objects thenr_eff_list is ignored.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

lpd_point

If callingstacking_weights() orpseudobma_weights()directly, a matrix of pointwise leave-one-out (or K-fold) log likelihoodsevaluated for different models. It should be aN byK matrixwhereN is sample size andK is the number of models. Eachcolumn corresponds to one model. These values can be calculatedapproximately usingloo() or by running exact leave-one-out or K-foldcross-validation.

Details

loo_model_weights() is a wrapper around thestacking_weights() andpseudobma_weights() functions that implements stacking, pseudo-BMA, andpseudo-BMA+ weighting for combining multiple predictive distributions. We canuse approximate or exact leave-one-out cross-validation (LOO-CV) or K-fold CVto estimate the expected log predictive density (ELPD).

The stacking method (method="stacking"), which is the default forloo_model_weights(), combines all models by maximizing the leave-one-outpredictive density of the combination distribution. That is, it finds theoptimal linear combining weights for maximizing the leave-one-out log score.

The pseudo-BMA method (method="pseudobma") finds the relative weightsproportional to the ELPD of each model. However, whenmethod="pseudobma", the default is to also use the Bayesian bootstrap(BB=TRUE), which corresponds to the pseudo-BMA+ method. The Bayesianbootstrap takes into account the uncertainty of finite data points andregularizes the weights away from the extremes of 0 and 1.

In general, we recommend stacking for averaging predictive distributions,while pseudo-BMA+ can serve as a computationally easier alternative.

Value

A numeric vector containing one weight for each model.

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Usingstacking to average Bayesian predictive distributions.Bayesian Analysis, advance publication, doi:10.1214/17-BA1091.(online).

See Also

Examples

## Not run: ### Demonstrating usage after fitting models with RStanlibrary(rstan)# generate fake data from N(0,1).N <- 100y <- rnorm(N, 0, 1)# Suppose we have three models: N(-1, sigma), N(0.5, sigma) and N(0.6,sigma).stan_code <- "  data {    int N;    vector[N] y;    real mu_fixed;  }  parameters {    real<lower=0> sigma;  }  model {    sigma ~ exponential(1);    y ~ normal(mu_fixed, sigma);  }  generated quantities {    vector[N] log_lik;    for (n in 1:N) log_lik[n] = normal_lpdf(y[n]| mu_fixed, sigma);  }"mod <- stan_model(model_code = stan_code)fit1 <- sampling(mod, data=list(N=N, y=y, mu_fixed=-1))fit2 <- sampling(mod, data=list(N=N, y=y, mu_fixed=0.5))fit3 <- sampling(mod, data=list(N=N, y=y, mu_fixed=0.6))model_list <- list(fit1, fit2, fit3)log_lik_list <- lapply(model_list, extract_log_lik)# optional but recommendedr_eff_list <- lapply(model_list, function(x) {  ll_array <- extract_log_lik(x, merge_chains = FALSE)  relative_eff(exp(ll_array))})# stacking method:wts1 <- loo_model_weights(  log_lik_list,  method = "stacking",  r_eff_list = r_eff_list,  optim_control = list(reltol=1e-10))print(wts1)# can also pass a list of psis_loo objects to avoid recomputing looloo_list <- lapply(1:length(log_lik_list), function(j) {  loo(log_lik_list[[j]], r_eff = r_eff_list[[j]])})wts2 <- loo_model_weights(  loo_list,  method = "stacking",  optim_control = list(reltol=1e-10))all.equal(wts1, wts2)# can provide names to be used in the resultsloo_model_weights(setNames(loo_list, c("A", "B", "C")))# pseudo-BMA+ method:set.seed(1414)loo_model_weights(loo_list, method = "pseudobma")# pseudo-BMA method (set BB = FALSE):loo_model_weights(loo_list, method = "pseudobma", BB = FALSE)# calling stacking_weights or pseudobma_weights directlylpd1 <- loo(log_lik_list[[1]], r_eff = r_eff_list[[1]])$pointwise[,1]lpd2 <- loo(log_lik_list[[2]], r_eff = r_eff_list[[2]])$pointwise[,1]lpd3 <- loo(log_lik_list[[3]], r_eff = r_eff_list[[3]])$pointwise[,1]stacking_weights(cbind(lpd1, lpd2, lpd3))pseudobma_weights(cbind(lpd1, lpd2, lpd3))pseudobma_weights(cbind(lpd1, lpd2, lpd3), BB = FALSE)## End(Not run)

Moment matching for efficient approximate leave-one-out cross-validation (LOO)

Description

Moment matching algorithm for updating a loo object when Pareto k estimatesare large.

Usage

loo_moment_match(x, ...)## Default S3 method:loo_moment_match(  x,  loo,  post_draws,  log_lik_i,  unconstrain_pars,  log_prob_upars,  log_lik_i_upars,  max_iters = 30L,  k_threshold = NULL,  split = TRUE,  cov = TRUE,  cores = getOption("mc.cores", 1),  ...)

Arguments

x

A fitted model object.

...

Further arguments passed to the custom functions documented above.

loo

A loo object to be modified.

post_draws

A function the takesx as the first argument and returnsa matrix of posterior draws of the model parameters.

log_lik_i

A function that takesx andi and returns a matrix (onecolumn per chain) or a vector (all chains stacked) of log-likelihood drawsof theith observation based on the modelx. If the draws are obtainedusing MCMC, the matrix with MCMC chains separated is preferred.

unconstrain_pars

A function that takes argumentsx, andpars andreturns posterior draws on the unconstrained space based on the posteriordraws on the constrained space passed viapars.

log_prob_upars

A function that takes argumentsx andupars andreturns a matrix of log-posterior density values of the unconstrainedposterior draws passed viaupars.

log_lik_i_upars

A function that takes argumentsx,upars, andiand returns a vector of log-likelihood draws of theith observation basedon the unconstrained posterior draws passed viaupars.

max_iters

Maximum number of moment matching iterations. Usually thisdoes not need to be modified. If the maximum number of iterations isreached, there will be a warning, and increasingmax_iters may improveaccuracy.

k_threshold

Threshold value for Pareto k values above which the momentmatching algorithm is used. The default value is1 - 1 / log10(S),whereS is the sample size.

split

Logical; Indicate whether to do the split transformation or notat the end of moment matching for each LOO fold.

cov

Logical; Indicate whether to match the covariance matrix of thesamples or not. IfFALSE, only the mean and marginal variances arematched.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

Details

Theloo_moment_match() function is an S3 generic and we provide adefault method that takes as arguments user-specified functionspost_draws,log_lik_i,unconstrain_pars,log_prob_upars, andlog_lik_i_upars. All of these functions should take.... as an argumentin addition to those specified for each function.

Value

Theloo_moment_match() methods return an updatedloo object. Thestructure of the updatedloo object is similar, but the method alsostores the original Pareto k diagnostic values in the diagnostics field.

Methods (by class)

References

Paananen, T., Piironen, J., Buerkner, P.-C., Vehtari, A. (2021).Implicitly adaptive importance sampling.Statistics and Computing, 31, 16.doi:10.1007/s11222-020-09982-2. arXiv preprint arXiv:1906.08850.

See Also

loo(),loo_moment_match_split()

Examples

# See the vignette for loo_moment_match()

Split moment matching for efficient approximate leave-one-out cross-validation (LOO)

Description

A function that computes the split moment matching importance sampling loo.Takes in the moment matching total transformation, transforms only halfof the draws, and computes a single elpd using multiple importance sampling.

Usage

loo_moment_match_split(  x,  upars,  cov,  total_shift,  total_scaling,  total_mapping,  i,  log_prob_upars,  log_lik_i_upars,  r_eff_i,  cores,  is_method,  ...)

Arguments

x

A fitted model object.

upars

A matrix containing the model parameters in unconstrained spacewhere they can have any real value.

cov

Logical; Indicate whether to match the covariance matrix of thesamples or not. IfFALSE, only the mean and marginal variances arematched.

total_shift

A vector representing the total shift made by the momentmatching algorithm.

total_scaling

A vector representing the total scaling of marginalvariance made by the moment matching algorithm.

total_mapping

A vector representing the total covariancetransformation made by the moment matching algorithm.

i

Observation index.

log_prob_upars

A function that takes argumentsx andupars andreturns a matrix of log-posterior density values of the unconstrainedposterior draws passed viaupars.

log_lik_i_upars

A function that takes argumentsx,upars, andiand returns a vector of log-likeliood draws of theith observation basedon the unconstrained posterior draws passed viaupars.

r_eff_i

MCMC relative effective sample size of thei'th loglikelihood draws.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

is_method

The importance sampling method to use. The following methodsare implemented:

  • "psis": Pareto-Smoothed Importance Sampling (PSIS). Default method.

  • "tis": Truncated Importance Sampling (TIS) with truncation atsqrt(S), whereS is the number of posterior draws.

  • "sis": Standard Importance Sampling (SIS).

...

Further arguments passed to the custom functions documented above.

Value

A list containing the updated log-importance weights andlog-likelihood values. Also returns the updated MCMC effective sample sizeand the integrand-specific log-importance weights.

References

Paananen, T., Piironen, J., Buerkner, P.-C., Vehtari, A. (2021).Implicitly adaptive importance sampling.Statistics and Computing, 31, 16.doi:10.1007/s11222-020-09982-2. arXiv preprint arXiv:1906.08850.

See Also

loo(),loo_moment_match()


Estimate leave-one-out predictive performance..

Description

Theloo_predictive_metric() function computes estimates of leave-one-outpredictive metrics given a set of predictions and observations. Currentlysupported metrics are mean absolute error, mean squared error and root meansquared error for continuous predictions and accuracy and balanced accuracyfor binary classification. Predictions are passed on to theE_loo()function, so this function assumes that the PSIS approximation is workingwell.

Usage

loo_predictive_metric(x, ...)## S3 method for class 'matrix'loo_predictive_metric(  x,  y,  log_lik,  ...,  metric = c("mae", "rmse", "mse", "acc", "balanced_acc"),  r_eff = 1,  cores = getOption("mc.cores", 1))

Arguments

x

A numeric matrix of predictions.

...

Additional arguments passed on toE_loo()

y

A numeric vector of observations. Length should be equal to thenumber of rows inx.

log_lik

A matrix of pointwise log-likelihoods. Should be of samedimension asx.

metric

The type of predictive metric to be used. Currentlysupported options are"mae","rmse" and"mse" for regression andfor binary classification"acc" and"balanced_acc".

"mae"

Mean absolute error.

"mse"

Mean squared error.

"rmse"

Root mean squared error, given by as the square root ofMSE.

"acc"

The proportion of predictions indicating the correct outcome.

"balanced_acc"

Balanced accuracy is given by the average of true positive and truenegative rates.

r_eff

A Vector of relative effective sample size estimates containingone element per observation. Seepsis() for more details.

cores

The number of cores to use for parallelization of⁠[psis()]⁠.Seepsis() for details.

Value

A list with the following components:

estimate

Estimate of the given metric.

se

Standard error of the estimate.

Examples

if (requireNamespace("rstanarm", quietly = TRUE)) {# Use rstanarm package to quickly fit a model and get both a log-likelihood# matrix and draws from the posterior predictive distributionlibrary("rstanarm")# data from help("lm")ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)d <- data.frame(  weight = c(ctl, trt),  group = gl(2, 10, 20, labels = c("Ctl","Trt")))fit <- stan_glm(weight ~ group, data = d, refresh = 0)ll <- log_lik(fit)r_eff <- relative_eff(exp(-ll), chain_id = rep(1:4, each = 1000))mu_pred <- posterior_epred(fit)# Leave-one-out mean absolute error of predictionsmae <- loo_predictive_metric(x = mu_pred, y = d$weight, log_lik = ll,                            pred_error = 'mae', r_eff = r_eff)# Leave-one-out 90%-quantile of mean absolute errormae_90q <- loo_predictive_metric(x = mu_pred, y = d$weight, log_lik = ll,                                pred_error = 'mae', r_eff = r_eff,                                type = 'quantile', probs = 0.9)}

Efficient approximate leave-one-out cross-validation (LOO) using subsampling,so that less costly and more approximate computation is made for all LOO-fold,and more costly and accurate computations are made only for m<N LOO-folds.

Description

Efficient approximate leave-one-out cross-validation (LOO) using subsampling,so that less costly and more approximate computation is made for all LOO-fold,and more costly and accurate computations are made only for m<N LOO-folds.

Usage

loo_subsample(x, ...)## S3 method for class ''function''loo_subsample(  x,  ...,  data = NULL,  draws = NULL,  observations = 400,  log_p = NULL,  log_g = NULL,  r_eff = 1,  save_psis = FALSE,  cores = getOption("mc.cores", 1),  loo_approximation = "plpd",  loo_approximation_draws = NULL,  estimator = "diff_srs",  llgrad = NULL,  llhess = NULL)

Arguments

x

A function. TheMethods (by class) section, below, has detaileddescriptions of how to specify the inputs.

data,draws,...

Forloo_subsample.function(), these are the data,posterior draws, and other arguments to pass to the log-likelihoodfunction. Note that for someloo_approximations, the draws will be replacedby the posteriors summary statistics to compute loo approximations. Seeargumentloo_approximation for details.

observations

The subsample observations to use. The argument can takefour (4) types of arguments:

  • NULL to use all observations. The algorithm then just usesstandardloo() orloo_approximate_posterior().

  • A single integer to specify the number of observations to be subsampled.

  • A vector of integers to provide the indices used to subset the data.These observations need to be subsampled with the same scheme as given bytheestimator argument.

  • Apsis_loo_ss object to use the same observations that were used in aprevious call toloo_subsample().

log_p,log_g

Should be supplied only if approximate posterior draws areused. The default (NULL) indicates draws are from "true" posterior (i.e.using MCMC). If notNULL then they should be specified as described inloo_approximate_posterior().

r_eff

Vector of relative effective sample size estimates for thelikelihood (exp(log_lik)) of each observation. This is related tothe relative efficiency of estimating the normalizing term inself-normalized importance sampling when using posterior draws obtainedwith MCMC. If MCMC draws are used andr_eff is not provided thenthe reported PSIS effective sample sizes and Monte Carlo error estimatescan be over-optimistic. If the posterior draws are (near) independent thenr_eff=1 can be used.r_eff has to be a scalar (same value is usedfor all observations) or a vector with length equal to the number ofobservations. The default value is 1. See therelative_eff() helperfunctions for help computingr_eff.

save_psis

Should the"psis" object created internally byloo_subsample() be saved in the returned object? Seeloo() for details.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

loo_approximation

What type of approximation of the loo_i's should be used?The default is"plpd" (the log predictive density using the posterior expectation).There are six different methods implemented to approximate loo_i's(see the references for more details):

  • "plpd": uses the lpd based on point estimates (i.e.,p(y_i|\hat{\theta})).

  • "lpd": uses the lpds (i,e.,p(y_i|y)).

  • "tis": uses truncated importance sampling to approximate PSIS-LOO.

  • "waic": uses waic (i.e.,p(y_i|y) - p_{waic}).

  • "waic_grad_marginal": uses waic approximation using first order deltamethod and posterior marginal variances to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad_marginal). Requires gradient oflikelihood function.

  • "waic_grad": uses waic approximation using first order delta method andposterior covariance to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad). Requires gradient of likelihoodfunction.

  • "waic_hess": uses waic approximation using second order delta method andposterior covariance to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad). Requires gradient and Hessian oflikelihood function.

As point estimates of\hat{\theta}, the posterior expectationsof the parameters are used.

loo_approximation_draws

The number of posterior draws used whenintegrating over the posterior. This is used ifloo_approximation is setto"lpd","waic", or"tis".

estimator

How shouldelpd_loo,p_loo andlooic be estimated?The default is"diff_srs".

  • "diff_srs": uses the difference estimator with simple random samplingwithout replacement (srs).p_loo is estimated using standard srs.(Magnusson et al., 2020)

  • "hh": uses the Hansen-Hurwitz estimator with sampling with replacementproportional to size, whereabs of loo_approximation is used as size.(Magnusson et al., 2019)

  • "srs": uses simple random sampling and ordinary estimation.

llgrad

The gradient of the log-likelihood. Thisis only used whenloo_approximation is"waic_grad","waic_grad_marginal", or"waic_hess". The default isNULL.

llhess

The Hessian of the log-likelihood. This is only usedwithloo_approximation = "waic_hess". The default isNULL.

Details

Theloo_subsample() function is an S3 generic and a methods iscurrently provided for log-likelihood functions. The implementation worksfor both MCMC and for posterior approximations where it is possible tocompute the log density for the approximation.

Value

loo_subsample() returns a named list with classc("psis_loo_ss", "psis_loo", "loo"). This has the same structure as objects returned byloo() but with the additional slot:

Methods (by class)

References

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2019).Leave-One-Out Cross-Validation for Large Data.InThirty-sixth International Conference on Machine Learning,PMLR 97:4244-4253.

Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2020).Leave-One-Out Cross-Validation for Model Comparison in Large Data.InProceedings of the 23rd International Conference on ArtificialIntelligence and Statistics (AISTATS), PMLR 108:341-351.

See Also

loo(),psis(),loo_compare()


Named lists

Description

Create a named list using specified names or, if names are omitted, using thenames of the objects in the list. The codelist(a = a, b = b) becomesnlist(a,b) andlist(a = a, b = 2) becomesnlist(a, b = 2), etc.

Usage

nlist(...)

Arguments

...

Objects to include in the list.

Value

A named list.

Examples

# All variables already defineda <- rnorm(100)b <- mat.or.vec(10, 3)nlist(a,b)# Define some variables in the call and take the rest from the environmentnlist(a, b, veggies = c("lettuce", "spinach"), fruits = c("banana", "papaya"))

The number of observations in apsis_loo_ss object.

Description

The number of observations in apsis_loo_ss object.

Usage

## S3 method for class 'psis_loo_ss'nobs(object, ...)

Arguments

object

apsis_loo_ss object.

...

Currently unused.


Get observation indices used in subsampling

Description

Get observation indices used in subsampling

Usage

obs_idx(x, rep = TRUE)

Arguments

x

Apsis_loo_ss object.

rep

If sampling with replacement is used, an observation can havemultiple samples and these are then repeated in the returned object ifrep=TRUE (e.g., a vectorc(1,1,2) indicates that observation 1 has beensubampled two times). Ifrep=FALSE only the unique indices are returned.

Value

An integer vector.


Extractor methods

Description

These are only defined in order to deprecate with a warning (rather thanremove and break backwards compatibility) the old way of accessing the pointestimates in a"psis_loo" or"psis" object. The new way as ofv2.0.0 is to get them from the"estimates" component of the object.

Usage

## S3 method for class 'loo'x[i]## S3 method for class 'loo'x[[i, exact = TRUE]]## S3 method for class 'loo'x$name

Arguments

x,i,exact,name

SeeExtract.


Parallel psis list computations

Description

Parallel psis list computations

Usage

parallel_psis_list(  N,  .loo_i,  .llfun,  data,  draws,  r_eff,  save_psis,  cores,  ...)parallel_importance_sampling_list(  N,  .loo_i,  .llfun,  data,  draws,  r_eff,  save_psis,  cores,  method,  ...)

Arguments

N

The total number of observations (i.e.nrow(data)).

.loo_i

The function used to compute individual loo contributions.

.llfun

Seellfun inloo.function().

data,draws,...

For theloo.function() method and theloo_i()function, these are the data, posterior draws, and other arguments to passto the log-likelihood function. See theMethods (by class) sectionbelow for details on how to specify these arguments.

r_eff

Vector of relative effective sample size estimates for thelikelihood (exp(log_lik)) of each observation. This is related tothe relative efficiency of estimating the normalizing term inself-normalized importance sampling when using posterior draws obtainedwith MCMC. If MCMC draws are used andr_eff is not provided thenthe reported PSIS effective sample sizes and Monte Carlo error estimatescan be over-optimistic. If the posterior draws are (near) independent thenr_eff=1 can be used.r_eff has to be a scalar (same value is usedfor all observations) or a vector with length equal to the number ofobservations. The default value is 1. See therelative_eff() helperfunctions for help computingr_eff.

save_psis

Should thepsis object created internally byloo() besaved in the returned object? Theloo() function callspsis()internally but by default discards the (potentially large)psis objectafter using it to compute the LOO-CV summaries. Settingsave_psis=TRUEwill add apsis_object component to the list returned byloo.This is useful if you plan to use theE_loo() function to computeweighted expectations after runningloo. Several functions in thebayesplot package also acceptpsis objects.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

method

Seeis_method forloo()

Details

Refactored function to handle parallel computationsfor psis_list


Diagnostics for Pareto smoothed importance sampling (PSIS)

Description

Print a diagnostic table summarizing the estimated Pareto shape parametersand PSIS effective sample sizes, find the indexes of observations for whichthe estimated Pareto shape parameterk is larger than somethreshold value, or plot observation indexes vs. diagnostic estimates.TheDetails section below provides a brief overview of thediagnostics, but we recommend consulting Vehtari, Gelman, and Gabry (2017)and Vehtari, Simpson, Gelman, Yao, and Gabry (2024) for full details.

Usage

pareto_k_table(x)pareto_k_ids(x, threshold = NULL)pareto_k_values(x)pareto_k_influence_values(x)psis_n_eff_values(x)mcse_loo(x, threshold = NULL)## S3 method for class 'psis_loo'plot(  x,  diagnostic = c("k", "ESS", "n_eff"),  ...,  label_points = FALSE,  main = "PSIS diagnostic plot")## S3 method for class 'psis'plot(  x,  diagnostic = c("k", "ESS", "n_eff"),  ...,  label_points = FALSE,  main = "PSIS diagnostic plot")

Arguments

x

An object created byloo() orpsis().

threshold

Forpareto_k_ids(),threshold is the minimumkvalue to flag (default is a sample sizeS dependend threshold1 - 1 / log10(S)). Formcse_loo(), if anyk estimates aregreater thanthreshold the MCSE estimate is returned asNASeeDetails for the motivation behind these defaults.

diagnostic

For theplot method, which diagnostic should beplotted? The options are"k" for Paretok estimates (thedefault), or"ESS" or"n_eff" for PSIS effective sample size estimates.

label_points,...

For theplot() method, iflabel_points isTRUE the observation numbers corresponding to any values ofkgreater than the diagnostic threshold will be displayed in the plot.Any arguments specified in... will be passed tographics::text()and can be used to control the appearance of the labels.

main

For theplot() method, a title for the plot.

Details

The reliability and approximate convergence rate of the PSIS-basedestimates can be assessed using the estimates for the shapeparameterk of the generalized Pareto distribution. Thediagnostic threshold for Paretok depends on sample sizeS (sample size dependent threshold was introduced by Vehtariet al. (2024), and before that fixed thresholds of 0.5 and 0.7 wererecommended). For simplicity,loo package uses the nominal samplesizeS when computing the sample size specificthreshold. This provides an optimistic threshold if the effectivesample size is less than 2200, but if MCMC-ESS > S/2 the differenceis usually negligible. Thinning of MCMC draws can be used toimprove the ratio ESS/S.

What if the estimated tail shape parameterkexceeds the diagnostic threshold?

Importance sampling is likely towork less well if the marginal posteriorp(\theta^s | y) andLOO posteriorp(\theta^s | y_{-i}) are very different, whichis more likely to happen with a non-robust model and highlyinfluential observations. If the estimated tail shape parameterk exceeds the diagnostic threshold, the user should bewarned. (Note: Ifk is greater than the diagnostic thresholdthen WAIC is also likely to fail, but WAIC lacks as accuratediagnostic.) When using PSIS in the context of approximate LOO-CV,we recommend one of the following actions:

Observation influence statistics

The estimated shape parameterk for each observation can be used as a measure of the observation'sinfluence on posterior distribution of the model. These can be obtained withpareto_k_influence_values().

Effective sample size and error estimates

In the case that weobtain the samples from the proposal distribution via MCMC theloopackage also computes estimates for the Monte Carlo error and the effectivesample size for importance sampling, which are more accurate for PSIS thanfor IS and TIS (see Vehtari et al (2024) for details). However, the PSISeffective sample size estimate will beover-optimistic when the estimate ofk is greater thanmin(1-1/log10(S), 0.7), whereS is the sample size.

Value

pareto_k_table() returns an object of class"pareto_k_table", which is a matrix with columns"Count","Proportion", and"Min. n_eff", and has its own print method.

pareto_k_ids() returns an integer vector indicating whichobservations have Paretok estimates abovethreshold.

pareto_k_values() returns a vector of the estimated Paretok parameters. These represent the reliability of sampling.

pareto_k_influence_values() returns a vector of the estimated Paretok parameters. These represent influence of the observations on themodel posterior distribution.

psis_n_eff_values() returns a vector of the estimated PSISeffective sample sizes.

mcse_loo() returns the Monte Carlo standard error (MCSE)estimate for PSIS-LOO. MCSE will be NA if any Paretok values areabovethreshold.

Theplot() method is called for its side effect and does notreturn anything. Ifx is the result of a call toloo()orpsis() thenplot(x, diagnostic) produces a plot ofthe estimates of the Pareto shape parameters (diagnostic = "k") orestimates of the PSIS effective sample sizes (diagnostic = "ESS").

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also


Convenience function for extracting pointwise estimates

Description

Convenience function for extracting pointwise estimates

Usage

pointwise(x, estimate, ...)## S3 method for class 'loo'pointwise(x, estimate, ...)

Arguments

x

Aloo object, for example one returned byloo(),loo_subsample(),loo_approximate_posterior(),loo_moment_match(), etc.

estimate

Which pointwise estimate to return. By default all arereturned. The objects returned by the different functions (loo(),loo_subsample(), etc.) have slightly different estimates available.Typically at a minimum the estimateselpd_loo,looic,mcse_elpd_loo,p_loo, andinfluence_pareto_k will be available, but there may beothers.

...

Currently ignored.

Value

A vector of length equal to the number of observations.

Examples

x <- loo(example_loglik_array())pointwise(x, "elpd_loo")

Print methods

Description

Print methods

Usage

## S3 method for class 'loo'print(x, digits = 1, ...)## S3 method for class 'waic'print(x, digits = 1, ...)## S3 method for class 'psis_loo'print(x, digits = 1, plot_k = FALSE, ...)## S3 method for class 'importance_sampling_loo'print(x, digits = 1, plot_k = FALSE, ...)## S3 method for class 'psis_loo_ap'print(x, digits = 1, plot_k = FALSE, ...)## S3 method for class 'psis'print(x, digits = 1, plot_k = FALSE, ...)## S3 method for class 'importance_sampling'print(x, digits = 1, plot_k = FALSE, ...)

Arguments

x

An object returned byloo(),psis(), orwaic().

digits

An integer passed tobase::round().

...

Arguments passed toplot.psis_loo() ifplot_k isTRUE.

plot_k

Logical. IfTRUE the estimates of the Pareto shapeparameterk are plotted. Ignored ifx was generated bywaic(). To just plotk without printing use theplot() method for 'loo' objects.

Value

x, invisibly.

See Also

pareto-k-diagnostic


Description

Print dimensions of log-likelihood or log-weights matrix

Usage

print_dims(x, ...)## S3 method for class 'importance_sampling'print_dims(x, ...)## S3 method for class 'psis_loo'print_dims(x, ...)## S3 method for class 'importance_sampling_loo'print_dims(x, ...)## S3 method for class 'waic'print_dims(x, ...)## S3 method for class 'kfold'print_dims(x, ...)## S3 method for class 'psis_loo_ss'print_dims(x, ...)

Arguments

x

The object returned bypsis(),loo(), orwaic().

...

Ignored.


Pareto smoothed importance sampling (PSIS)

Description

Implementation of Pareto smoothed importance sampling (PSIS), a method forstabilizing importance ratios. The version of PSIS implemented herecorresponds to the algorithm presented in Vehtari, Simpson, Gelman, Yao,and Gabry (2024).For PSIS diagnostics see thepareto-k-diagnostic page.

Usage

psis(log_ratios, ...)## S3 method for class 'array'psis(log_ratios, ..., r_eff = 1, cores = getOption("mc.cores", 1))## S3 method for class 'matrix'psis(log_ratios, ..., r_eff = 1, cores = getOption("mc.cores", 1))## Default S3 method:psis(log_ratios, ..., r_eff = 1)is.psis(x)is.sis(x)is.tis(x)

Arguments

log_ratios

An array, matrix, or vector of importance ratios on the logscale (for PSIS-LOO these arenegative log-likelihood values). See theMethods (by class) section below for a detailed description of howto specify the inputs for each method.

...

Arguments passed on to the various methods.

r_eff

Vector of relative effective sample size estimates containingone element per observation. The values provided should be the relativeeffective sample sizes of1/exp(log_ratios) (i.e.,1/ratios).This is related to the relative efficiency of estimating the normalizingterm in self-normalizing importance sampling. Ifr_eff is notprovided then the reported PSIS effective sample sizes and Monte Carloerror estimates can be over-optimistic. If the posterior draws are (near)independent thenr_eff=1 can be used.r_eff has to be a scalar (samevalue is used for all observations) or a vector with length equal to thenumber of observations. The default value is 1. See therelative_eff()helper function for computingr_eff.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

x

Foris.psis(), an object to check.

Value

Thepsis() methods return an object of class"psis",which is a named list with the following components:

log_weights

Vector or matrix of smoothed (and truncated) butunnormalized logweights. To get normalized weights use theweights() method provided for objects ofclass"psis".

diagnostics

A named list containing two vectors:

  • pareto_k: Estimates of the shape parameterk of thegeneralized Pareto distribution. See thepareto-k-diagnosticpage for details.

  • n_eff: PSIS effective sample size estimates.

Objects of class"psis" also have the followingattributes:

norm_const_log

Vector of precomputed values ofcolLogSumExps(log_weights) that areused internally by theweights method to normalize the log weights.

tail_len

Vector of tail lengths used for fitting the generalized Pareto distribution.

r_eff

If specified, the user'sr_eff argument.

dims

Integer vector of length 2 containingS (posterior sample size)andN (number of observations).

method

Method used for importance sampling, herepsis.

Methods (by class)

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

Examples

log_ratios <- -1 * example_loglik_array()r_eff <- relative_eff(exp(-log_ratios))psis_result <- psis(log_ratios, r_eff = r_eff)str(psis_result)plot(psis_result)# extract smoothed weightslw <- weights(psis_result) # default args are log=TRUE, normalize=TRUEulw <- weights(psis_result, normalize=FALSE) # unnormalized log-weightsw <- weights(psis_result, log=FALSE) # normalized weights (not log-weights)uw <- weights(psis_result, log=FALSE, normalize = FALSE) # unnormalized weights

Diagnostics for Laplace and ADVI approximations and Laplace-loo and ADVI-loo

Description

Diagnostics for Laplace and ADVI approximations and Laplace-loo and ADVI-loo

Usage

psis_approximate_posterior(  log_p = NULL,  log_g = NULL,  log_liks = NULL,  cores,  save_psis,  ...,  log_q = NULL)

Arguments

log_p

The log-posterior (target) evaluated at S samples from theproposal distribution (g). A vector of length S.

log_g

The log-density (proposal) evaluated at S samples from theproposal distribution (g). A vector of length S.

log_liks

A log-likelihood matrix of size S * N, where N is the numberof observations and S is the number of samples from q. Seeloo.matrix() for details. Default isNULL. Then only theposterior is evaluated using the k_hat diagnostic.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

save_psis

Should thepsis object created internally byloo() besaved in the returned object? Theloo() function callspsis()internally but by default discards the (potentially large)psis objectafter using it to compute the LOO-CV summaries. Settingsave_psis=TRUEwill add apsis_object component to the list returned byloo.This is useful if you plan to use theE_loo() function to computeweighted expectations after runningloo. Several functions in thebayesplot package also acceptpsis objects.

log_q

Deprecated argument name (the same as log_g).

Value

If log likelihoods are supplied, the function returns a"loo" object,otherwise the function returns a"psis" object.

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

loo() andpsis()


Pareto smoothed importance sampling (deprecated, old version)

Description

As of version⁠2.0.0⁠ this function isdeprecated. Please use thepsis() function for the new PSIS algorithm.

Usage

psislw(  lw,  wcp = 0.2,  wtrunc = 3/4,  cores = getOption("mc.cores", 1),  llfun = NULL,  llargs = NULL,  ...)

Arguments

lw

A matrix or vector of log weights. For computing LOO,lw = -log_lik, thenegative of anS (simulations) byN (datapoints) pointwise log-likelihood matrix.

wcp

The proportion of importance weights to use for the generalizedPareto fit. The100*wcp\from which to estimate the parameters of the generalized Paretodistribution.

wtrunc

For truncating very large weights toS^wtrunc. Setto zero for no truncation.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER), the old optionloo.cores is nowdeprecated but will be given precedence overmc.cores until it isremoved.As of version 2.0.0, the default is now 1 core ifmc.cores is not set, but we recommend using as many (or close to asmany) cores as possible.

llfun,llargs

Seeloo.function().

...

Ignored whenpsislw() is called directly. The... isonly used internally whenpsislw() is called by theloo()function.

Value

A named list with componentslw_smooth (modified log weights) andpareto_k (estimated generalized Pareto shape parameter(s) k).

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

pareto-k-diagnostic for PSIS diagnostics.


Convenience function for computing relative efficiencies

Description

relative_eff() computes the the MCMC effective sample size divided bythe total sample size.

Usage

relative_eff(x, ...)## Default S3 method:relative_eff(x, chain_id, ...)## S3 method for class 'matrix'relative_eff(x, chain_id, ..., cores = getOption("mc.cores", 1))## S3 method for class 'array'relative_eff(x, ..., cores = getOption("mc.cores", 1))## S3 method for class ''function''relative_eff(  x,  chain_id,  ...,  cores = getOption("mc.cores", 1),  data = NULL,  draws = NULL)## S3 method for class 'importance_sampling'relative_eff(x, ...)

Arguments

x

A vector, matrix, 3-D array, or function. See theMethods (byclass) section below for details on specifyingx, but where"log-likelihood" is mentioned replace it with one of the followingdepending on the use case:

  • For use with theloo() function, the values inx (or generated byx, if a function) should belikelihood values(i.e.,exp(log_lik)), not on the log scale.

  • For generic use withpsis(), the values inx should be the reciprocalof the importance ratios (i.e.,exp(-log_ratios)).

chain_id

A vector of lengthNROW(x) containing MCMC chainindexes for each each row ofx (if a matrix) or each value inx (if a vector). Nochain_id is needed ifx is a 3-Darray. If there areC chains then valid chain indexes are valuesin1:C.

cores

The number of cores to use for parallelization.

data,draws,...

Same as for theloo() function method.

Value

A vector of relative effective sample sizes.

Methods (by class)

Examples

LLarr <- example_loglik_array()LLmat <- example_loglik_matrix()dim(LLarr)dim(LLmat)rel_n_eff_1 <- relative_eff(exp(LLarr))rel_n_eff_2 <- relative_eff(exp(LLmat), chain_id = rep(1:2, each = 500))all.equal(rel_n_eff_1, rel_n_eff_2)

Standard importance sampling (SIS)

Description

Implementation of standard importance sampling (SIS).

Usage

sis(log_ratios, ...)## S3 method for class 'array'sis(log_ratios, ..., r_eff = NULL, cores = getOption("mc.cores", 1))## S3 method for class 'matrix'sis(log_ratios, ..., r_eff = NULL, cores = getOption("mc.cores", 1))## Default S3 method:sis(log_ratios, ..., r_eff = NULL)

Arguments

log_ratios

An array, matrix, or vector of importance ratios on the logscale (for Importance sampling LOO, these arenegative log-likelihoodvalues). See theMethods (by class) section below for a detaileddescription of how to specify the inputs for each method.

...

Arguments passed on to the various methods.

r_eff

Vector of relative effective sample size estimates containingone element per observation. The values provided should be the relativeeffective sample sizes of1/exp(log_ratios) (i.e.,1/ratios).This is related to the relative efficiency of estimating the normalizingterm in self-normalizing importance sampling. See therelative_eff()helper function for computingr_eff. If usingpsis withdraws of thelog_ratios not obtained from MCMC then the warningmessage thrown when not specifyingr_eff can be disabled bysettingr_eff toNA.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

Value

Thesis() methods return an object of class"sis",which is a named list with the following components:

log_weights

Vector or matrix of smoothed butunnormalized logweights. To get normalized weights use theweights() method provided for objects ofclasssis.

diagnostics

A named list containing one vector:

  • pareto_k: Not used insis, all set to 0.

  • n_eff: effective sample size estimates.

Objects of class"sis" also have the followingattributes:

norm_const_log

Vector of precomputed values ofcolLogSumExps(log_weights) that areused internally by theweights method to normalize the log weights.

r_eff

If specified, the user'sr_eff argument.

tail_len

Not used forsis.

dims

Integer vector of length 2 containingS (posterior sample size)andN (number of observations).

method

Method used for importance sampling, heresis.

Methods (by class)

References

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

Examples

log_ratios <- -1 * example_loglik_array()r_eff <- relative_eff(exp(-log_ratios))sis_result <- sis(log_ratios, r_eff = r_eff)str(sis_result)# extract smoothed weightslw <- weights(sis_result) # default args are log=TRUE, normalize=TRUEulw <- weights(sis_result, normalize=FALSE) # unnormalized log-weightsw <- weights(sis_result, log=FALSE) # normalized weights (not log-weights)uw <- weights(sis_result, log=FALSE, normalize = FALSE) # unnormalized weights

Truncated importance sampling (TIS)

Description

Implementation of truncated (self-normalized) importance sampling (TIS),truncated at S^(1/2) as recommended by Ionides (2008).

Usage

tis(log_ratios, ...)## S3 method for class 'array'tis(log_ratios, ..., r_eff = 1, cores = getOption("mc.cores", 1))## S3 method for class 'matrix'tis(log_ratios, ..., r_eff = 1, cores = getOption("mc.cores", 1))## Default S3 method:tis(log_ratios, ..., r_eff = 1)

Arguments

log_ratios

An array, matrix, or vector of importance ratios on the logscale (for Importance sampling LOO, these arenegative log-likelihoodvalues). See theMethods (by class) section below for a detaileddescription of how to specify the inputs for each method.

...

Arguments passed on to the various methods.

r_eff

Vector of relative effective sample size estimates containingone element per observation. The values provided should be the relativeeffective sample sizes of1/exp(log_ratios) (i.e.,1/ratios).This is related to the relative efficiency of estimating the normalizingterm in self-normalizing importance sampling. Ifr_eff is notprovided then the reported (T)IS effective sample sizes and Monte Carloerror estimates can be over-optimistic. If the posterior draws are (near)independent thenr_eff=1 can be used.r_eff has to be a scalar (samevalue is used for all observations) or a vector with length equal to thenumber of observations. The default value is 1. See therelative_eff()helper function for computingr_eff.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

Value

Thetis() methods return an object of class"tis",which is a named list with the following components:

log_weights

Vector or matrix of smoothed (and truncated) butunnormalized logweights. To get normalized weights use theweights() method provided for objects ofclasstis.

diagnostics

A named list containing one vector:

  • pareto_k: Not used intis, all set to 0.

  • n_eff: Effective sample size estimates.

Objects of class"tis" also have the followingattributes:

norm_const_log

Vector of precomputed values ofcolLogSumExps(log_weights) that areused internally by theweights()method to normalize the log weights.

r_eff

If specified, the user'sr_eff argument.

tail_len

Not used fortis.

dims

Integer vector of length 2 containingS (posterior sample size)andN (number of observations).

method

Method used for importance sampling, heretis.

Methods (by class)

References

Ionides, Edward L. (2008). Truncated importance sampling.Journal of Computational and Graphical Statistics 17(2): 295–311.

See Also

Examples

log_ratios <- -1 * example_loglik_array()r_eff <- relative_eff(exp(-log_ratios))tis_result <- tis(log_ratios, r_eff = r_eff)str(tis_result)# extract smoothed weightslw <- weights(tis_result) # default args are log=TRUE, normalize=TRUEulw <- weights(tis_result, normalize=FALSE) # unnormalized log-weightsw <- weights(tis_result, log=FALSE) # normalized weights (not log-weights)uw <- weights(tis_result, log=FALSE, normalize = FALSE) # unnormalized weights

Updatepsis_loo_ss objects

Description

Updatepsis_loo_ss objects

Usage

## S3 method for class 'psis_loo_ss'update(  object,  ...,  data = NULL,  draws = NULL,  observations = NULL,  r_eff = 1,  cores = getOption("mc.cores", 1),  loo_approximation = NULL,  loo_approximation_draws = NULL,  llgrad = NULL,  llhess = NULL)

Arguments

object

Apsis_loo_ss object to update.

...

Currently not used.

data,draws

Seeloo_subsample.function().

observations

The subsample observations to use. The argument can takefour (4) types of arguments:

  • NULL to use all observations. The algorithm then just usesstandardloo() orloo_approximate_posterior().

  • A single integer to specify the number of observations to be subsampled.

  • A vector of integers to provide the indices used to subset the data.These observations need to be subsampled with the same scheme as given bytheestimator argument.

  • Apsis_loo_ss object to use the same observations that were used in aprevious call toloo_subsample().

r_eff

Vector of relative effective sample size estimates for thelikelihood (exp(log_lik)) of each observation. This is related tothe relative efficiency of estimating the normalizing term inself-normalized importance sampling when using posterior draws obtainedwith MCMC. If MCMC draws are used andr_eff is not provided thenthe reported PSIS effective sample sizes and Monte Carlo error estimatescan be over-optimistic. If the posterior draws are (near) independent thenr_eff=1 can be used.r_eff has to be a scalar (same value is usedfor all observations) or a vector with length equal to the number ofobservations. The default value is 1. See therelative_eff() helperfunctions for help computingr_eff.

cores

The number of cores to use for parallelization. This defaults tothe optionmc.cores which can be set for an entire R session byoptions(mc.cores = NUMBER). The old optionloo.cores is nowdeprecated but will be given precedence overmc.cores untilloo.cores is removed in a future release.As of version2.0.0 the default is now 1 core ifmc.cores is not set, but werecommend using as many (or close to as many) cores as possible.

  • Note for Windows 10 users: it isstronglyrecommended to avoid usingthe.Rprofile file to setmc.cores (using thecores argument orsettingmc.cores interactively or in a script is fine).

loo_approximation

What type of approximation of the loo_i's should be used?The default is"plpd" (the log predictive density using the posterior expectation).There are six different methods implemented to approximate loo_i's(see the references for more details):

  • "plpd": uses the lpd based on point estimates (i.e.,p(y_i|\hat{\theta})).

  • "lpd": uses the lpds (i,e.,p(y_i|y)).

  • "tis": uses truncated importance sampling to approximate PSIS-LOO.

  • "waic": uses waic (i.e.,p(y_i|y) - p_{waic}).

  • "waic_grad_marginal": uses waic approximation using first order deltamethod and posterior marginal variances to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad_marginal). Requires gradient oflikelihood function.

  • "waic_grad": uses waic approximation using first order delta method andposterior covariance to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad). Requires gradient of likelihoodfunction.

  • "waic_hess": uses waic approximation using second order delta method andposterior covariance to approximatep_{waic} (ie.p(y_i|\hat{\theta})-p_waic_grad). Requires gradient and Hessian oflikelihood function.

As point estimates of\hat{\theta}, the posterior expectationsof the parameters are used.

loo_approximation_draws

The number of posterior draws used whenintegrating over the posterior. This is used ifloo_approximation is setto"lpd","waic", or"tis".

llgrad

The gradient of the log-likelihood. Thisis only used whenloo_approximation is"waic_grad","waic_grad_marginal", or"waic_hess". The default isNULL.

llhess

The Hessian of the log-likelihood. This is only usedwithloo_approximation = "waic_hess". The default isNULL.

Details

Ifobservations is updated then if a vector of indices or apsis_loo_ssobject is supplied the updated object will have exactly the observationsindicated by the vector orpsis_loo_ss object. If a single integer issupplied, new observations will be sampled to reach the supplied sample size.

Value

Apsis_loo_ss object.


Widely applicable information criterion (WAIC)

Description

Thewaic() methods can be used to compute WAIC from the pointwiselog-likelihood. However, we recommend LOO-CV using PSIS (as implemented bytheloo() function) because PSIS provides useful diagnostics as well aseffective sample size and Monte Carlo estimates.

Usage

waic(x, ...)## S3 method for class 'array'waic(x, ...)## S3 method for class 'matrix'waic(x, ...)## S3 method for class ''function''waic(x, ..., data = NULL, draws = NULL)is.waic(x)

Arguments

x

A log-likelihood array, matrix, or function. TheMethods (by class)section, below, has detailed descriptions of how to specify the inputs foreach method.

draws,data,...

For the function method only. See theMethods (by class) section below for details on these arguments.

Value

A named list (of classc("waic", "loo")) with components:

estimates

A matrix with two columns ("Estimate","SE") and threerows ("elpd_waic","p_waic","waic"). This containspoint estimates and standard errors of the expected log pointwise predictivedensity (elpd_waic), the effective number of parameters(p_waic) and the information criterionwaic (which is just-2 * elpd_waic, i.e., converted to deviance scale).

pointwise

A matrix with three columns (and number of rows equal to the number ofobservations) containing the pointwise contributions of each of the abovemeasures (elpd_waic,p_waic,waic).

Methods (by class)

References

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation andwidely application information criterion in singular learning theory.Journal of Machine Learning Research11, 3571-3594.

Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian modelevaluation using leave-one-out cross-validation and WAIC.Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4(journal version,preprint arXiv:1507.04544).

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).Pareto smoothed importance sampling.Journal of Machine Learning Research,25(72):1-58.PDF

See Also

Examples

### Array and matrix methodsLLarr <- example_loglik_array()dim(LLarr)LLmat <- example_loglik_matrix()dim(LLmat)waic_arr <- waic(LLarr)waic_mat <- waic(LLmat)identical(waic_arr, waic_mat)## Not run: log_lik1 <- extract_log_lik(stanfit1)log_lik2 <- extract_log_lik(stanfit2)(waic1 <- waic(log_lik1))(waic2 <- waic(log_lik2))print(compare(waic1, waic2), digits = 2)## End(Not run)

Extract importance sampling weights

Description

Extract importance sampling weights

Usage

## S3 method for class 'importance_sampling'weights(object, ..., log = TRUE, normalize = TRUE)

Arguments

object

An object returned bypsis(),tis(), orsis().

...

Ignored.

log

Should the weights be returned on the log scale? Defaults toTRUE.

normalize

Should the weights be normalized? Defaults toTRUE.

Value

Theweights() method returns an object with the same dimensions asthelog_weights component ofobject. Thenormalize andlogarguments control whether the returned weights are normalized and whetheror not to return them on the log scale.

Examples

# See the examples at help("psis")

[8]ページ先頭

©2009-2025 Movatter.jp