Aggregating Average Treatment Effects with baggr

Rachael Meager, Witold Wiecek

2025-06-19

This vignette is written for baggr users who want to learn about (Bayesian) meta-analysis concepts and analyse data on typically continuous variables. If you are looking for information that is more specific to binary data, please read vignette("baggr_binary"). This article - like the package itself - is still under construction. We encourage your feedback.

baggr (pronounced “bagger” or “badger” and short for Bayesian Aggregator) is a package for aggregating evidence on causal effects measured in several separate and different instances. These instances may be different studies, groups, locations or “sites”, however conceptualised. We refer to these separate pieces of evidence as “groups” for the remainder of this vignette. When each group is a study, the model is that of meta-analysis, but aggregation of evidence is not limited to this case.

One of the most basic objects of interest is the average treatment effect (ATE), the difference in the mean outcome in treatment and control groups; for more information see work by Rubin (1974). In meta-analysis we are often interested in the average of this average effect across groups, estimated using all the evidence from all the groups. Consider the case where the evidence in each study or group is generated by comparing the outcomes of treatment and control samples in a randomized experiment. We will ignore any covariate information at the individual or group level for now.

Consider some outcome of interest \(y_{ik}\) such as consumption, income or health outcomes for a household or individual \(i = 1,2,...,N_k\) in study group \(k = 1,2,...,K\). Let \(Y_k\) denote the \(N_k\)-length vector of observed outcomes from group \(k\). Denote the binary indicator of treatment status by \(T_{ik}\), and denote by \(T_k\) the \(N_k\)-length vector of all treatment status indicators from group \(k\).

Suppose that \(y_{ik}\) varies randomly around its mean \(\mu_k + \tau_k T_{ik}\). In this setting \(\tau_k\) is the treatment effect in group \(k\). The random variation in \(y_{ik}\) may be the result of sampling variation or measurement error, as in the Rubin (1981) model, or it may be the result of unmodelled heterogeneity or uncertainty in outcomes for individuals within the group. Allow the variance of the outcome variable \(y_{ik}\) to vary across sites, so \(\sigma_{y_k}^2\) may differ across \(k\).

Data inputs: reported effects or full individual-level datasets

For average effects aggregation, baggr allows three types of data inputs. The user may supply, within a data frame, any of the following:

  1. A set of estimated treatment effects \(\{\hat{\tau_k}\}_{k=1}^{K}\) and their standard errors \(\{\hat{se_k}\}_{k=1}^{K}\) from each study. This should be formatted as two column vectors of length \(K\) within the data frame, where \(\hat{\tau_k}\) is the \(k\)-th entry of the treatment effect vector and \(\hat{se_k}\) is the \(k\)-th entry of the standard errors vector. Columns should be named "tau" and "se". The model will be "rubin" (see below).

  2. A set of control group means and estimated treatment effects \(\{\hat{\mu_k}, \hat{\tau_k}\}_{k=1}^{K}\), as well as the standard errors for both \(\{\hat{se}_{\mu k}, \hat{se}_{\tau k}\}_{k=1}^{K}\), for each study site. This should be formatted as four vectors of length \(K\) within the data frame, analogous to the above. Columns should be named "mu", "tau", "se.mu", "se.tau". The model will be "mutau" (see below).

  3. The full data sets from all the original studies \(\{Y_k, T_k\}_{k=1}^{K}\). This should be formatted as three vectors of length \(\sum_{k=1}^K N_{k}\), which we recommend naming "outcome", "treatment", "group" (for site indicators), but the names can also be specified when calling the baggr() function. The model will be "rubin_full" (see below).
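For instance, a minimal summary-data input of the first type can be assembled by hand; the numbers below are made up purely for illustration:

```r
# Hypothetical summary-data input for the "rubin" model:
# one row per study, with estimated effects and their standard errors.
my_summary_data <- data.frame(
  tau = c(1.5, 0.8, 2.1),   # estimated treatment effect in each of 3 studies
  se  = c(0.5, 0.4, 0.9)    # corresponding standard errors
)
# baggr(my_summary_data, model = "rubin") would then aggregate these
```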

As an example of an individual-level data set we include the data frames microcredit and microcredit_simplified. The former contains all the microcredit outcome data used in Meager (2019), standardized to USD PPP in 2009 terms per two weeks (a time period is necessary as these are flow variables). It therefore contains NAs and other undesirable features, to allow the user to see how baggr handles these common data issues. The data set microcredit_simplified has these issues cleaned up and contains only one outcome of interest, consumer durables spending.

baggr also has a function that automatically produces summary data from full data sets, in case one wishes to run the comparatively faster summary-data models. The prepare_ma() function, applied to a data frame with columns named "group", "outcome", and "treatment", automatically estimates the control group means, treatment effects, and associated standard errors for each group using an ordinary least squares regression. The resulting output is already formatted as a valid input to the baggr() command itself:

prepare_ma(microcredit_simplified, outcome = "consumption")
#>   group       mu       tau     se.mu    se.tau n.mu n.tau
#> 1     1 303.6065  5.510755  2.559897  4.101140 8298  8262
#> 2     2 280.0887 50.449393 11.141075 22.156317  260   701
#> 3     3 196.4215 -5.171338 14.432604 19.266339  444   551
#> 4     4 276.2791  4.641604  3.730907  5.451094 3248  3579
#> 5     5 327.5246 -2.935731  4.027768  6.022955 2771  2716

ATE aggregation models in baggr

baggr currently contains two different models suitable for aggregating sets of average treatment effects. Consider first the evidence aggregation model from Rubin (1981), discussed extensively in Chapter 5 of Gelman et al. (2013); the model consists of a hierarchical likelihood as follows:

\[\begin{equation}
\begin{aligned}
\hat{\tau_k} &\sim N(\tau_k, \hat{se_k}^2) \; \forall \; k \\
\tau_k &\sim N(\tau, \sigma_{\tau}^2) \; \forall \; k.
\end{aligned}
\label{rubin model}
\end{equation}\]
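The generative structure above can be sketched in a few lines of R; the hyper-parameter values here are hypothetical, chosen only to mimic the scale of the 8 schools example used later in this vignette:

```r
# Simulating the two-level Rubin likelihood (hypothetical values):
set.seed(1)
K         <- 8
tau       <- 8                                 # hypermean of treatment effects
sigma_tau <- 6                                 # between-group SD of effects
se        <- c(15, 10, 16, 11, 9, 11, 10, 18)  # per-study standard errors
tau_k     <- rnorm(K, tau, sigma_tau)          # true (latent) group effects
tau_hat_k <- rnorm(K, tau_k, se)               # the estimates each study reports
```

Meta-analysis then works in the opposite direction: given only tau_hat_k and se, infer the latent tau_k, tau and sigma_tau.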

The motivation for this model structure is discussed in detail in the sources above and in Meager (2019). To complete the Bayesian model, we now need priors. baggr has a set of default priors for each model (adjusted to data), as well as allowing the user to specify her own priors if desired. In the Rubin model, baggr's default priors on the hyper-parameters are listed in the summary table below.

In case you also have data on the control groups' mean outcomes and the uncertainty on those, you can augment the Rubin (1981) model to incorporate that information. Following Meager (2019), if one has access to the estimated control means \(\{\hat{\mu_k}\}^K_{k=1}\) and their standard errors \(\{\hat{se}_{\mu k}\}^K_{k=1}\), one can fit a joint Gaussian model on the pairs \(\{\hat{\mu_k}, \hat{\tau_k}\}_{k=1}^{K}\):

\[\begin{equation}
\begin{aligned}
\hat{\tau_k} &\sim N(\tau_k, \hat{se}_{\tau k}^2) \; \forall \; k \\
\hat{\mu_k} &\sim N(\mu_k, \hat{se}_{\mu k}^2) \; \forall \; k \\
\left( \begin{array}{c} \mu_{k} \\ \tau_{k} \end{array} \right) &\sim
N\left( \left( \begin{array}{c} \mu \\ \tau \end{array} \right), V \right)
\; \text{where} \;
V = \left[ \begin{array}{cc} \sigma^2_{\mu} & \sigma_{\tau\mu} \\
\sigma_{\tau\mu} & \sigma_{\tau}^2 \end{array} \right]
\; \forall \; k.
\end{aligned}
\label{full data model}
\end{equation}\]

In baggr this model is referred to as "mutau".

If you have only a few groups, the priors on \(V\) will need to be relatively strong to avoid overfitting. See Meager (2019) for more discussion of this issue in particular, or see the Stan Manual on hierarchical priors. The default priors are listed in the summary table below.

Models, their inputs, likelihood and priors: a summary table

| Model (name in baggr) | Input columns | Level-1 likelihood | Level-2 likelihood | Default priors |
|---|---|---|---|---|
| summary data “Rubin” ("rubin") | tau, se | \(\hat{\tau_k} \sim N(\tau_k, \hat{se_k}^2)\) | \(\tau_k \sim N(\tau, \sigma_{\tau}^2)\) | \(\tau \sim \mathcal{N}(0, 100)\), \(\sigma_{\tau} \sim \mathcal{U}(0, 10\tilde{\sigma})\) |
| “\(\mu\) and \(\tau\)” ("mutau") | tau, mu, se.tau, se.mu | \(\hat{\tau_k} \sim N(\tau_k, \hat{se}_{\tau,k}^2)\), \(\hat{\mu_k} \sim N(\mu_k, \hat{se}_{\mu,k}^2)\) | \(\pmatrix{\mu_k \\ \tau_k} \sim N(\pmatrix{\mu \\ \tau}, V)\) | \(V = \theta \Omega \theta'\) where \(\theta \sim Cauchy(0,10)\), \(\Omega \sim LKJ(3)\); \(\pmatrix{\mu \\ \tau} \sim N(0, 100^2 Id_2)\) |
| full data “Rubin” ("rubin_full") | outcome, treatment, group | Same as for “\(\mu\) and \(\tau\)” | Same as for “\(\mu\) and \(\tau\)” | Same as for “\(\mu\) and \(\tau\)” |
| Logit ("logit") | outcome, treatment, group | See vignette("baggr_binary") | | |

Where inputs are individual-level data, i.e. outcome, treatment, group, you can specify the column names as arguments to the baggr() function.

Prior choice in baggr

In the “Rubin” and “\(\mu\) and \(\tau\)” models, the user can specify custom priors beyond the defaults using the prior arguments. These prior arguments are subdivided into three categories: priors on the hypermean, on the hyper-SD and, for multivariate models such as "mutau", on the hyper-correlation.

The possible prior distributions allowed in the current version are documented in ?priors.

Notation for priors is “plain-text”, in that you can write the distributions as normal(), uniform() etc. See ?priors for details, or continue to the example below with the Rubin model.

Running the Rubin Model in baggr

To demonstrate the Rubin model in baggr, consider the 8 schools example from Rubin (1981). In this dataset, 8 schools in the United States performed similar randomized experiments to estimate the causal effect of an SAT tutoring program on learning outcomes. Reported treatment effects and their standard errors are included in baggr as a data frame:

schools
#>      group tau se
#> 1 School A  28 15
#> 2 School B   8 10
#> 3 School C  -3 16
#> 4 School D   7 11
#> 5 School E  -1  9
#> 6 School F   1 11
#> 7 School G  18 10
#> 8 School H  12 18

To fit the model in baggr (having followed the installation instructions and loaded the package):

baggr_schools <- baggr(schools, model = "rubin", pooling = "partial")

This creates a baggr class object, and you can access the underlying stanfit object by calling baggr_schools$fit. If you don't change the default priors, then baggr will print a message informing you of the priors it has chosen.

Printing baggr_schools returns a summary of the posterior inference. First, baggr records the model type and the pooling regime chosen by the user or implemented by default. Second, baggr returns inference on the aggregate treatment effect \(\tau\) by reporting its posterior mean and 95% uncertainty interval, and similar inference on the hyper-SD \(\sigma_{\tau}\). Lastly, it prints the “updated” inference on each of the groups' treatment effects, displaying their new posterior means, standard deviations and pooling factors (see below).

print(baggr_schools)
#> Model type: Rubin model with aggregate data
#> Pooling of effects: partial
#>
#> Aggregate treatment effect (on mean), 8 groups:
#> Hypermean (tau) =  8.1 with 95% interval -1.8 to 18.0
#> Hyper-SD (sigma_tau) = 6.61 with 95% interval 0.29 to 20.31
#> Posterior predictive effect = 8.4 with 95% interval -11.9 to 28.6
#> Total pooling (1 - I^2) = 0.76 with 95% interval 0.25 to 1.00
#>
#> Group-specific treatment effects:
#>          mean  sd   2.5%  50% 97.5% pooling
#> School A 11.3 8.1  -2.26 10.3    31    0.82
#> School B  8.0 6.3  -4.90  7.9    21    0.72
#> School C  6.3 7.7 -10.92  6.8    21    0.84
#> School D  7.7 6.5  -5.88  7.6    21    0.75
#> School E  5.1 6.2  -8.72  5.6    16    0.69
#> School F  6.2 6.7  -8.75  6.5    19    0.75
#> School G 10.8 6.8  -0.89 10.2    26    0.72
#> School H  8.6 7.9  -6.64  8.2    26    0.86

Choosing priors

It is possible to fit the Rubin model without specifying any priors, in which case the user is notified about the automatic prior choice that baggr performs. But the priors can also be easily customised by using the prior_ arguments to baggr(). The Rubin model performs univariate shrinkage, so we will not need to specify a hyper-correlation prior, but we can specify custom priors on the hyper-mean and hyper-SD. If desired, the user can specify only some priors as custom distributions - the rest will be chosen automatically and the user will be notified of this in the output.

Consider changing both as an example, say placing a normal prior with mean -5 and standard deviation 10 on our hypermean \(\tau\), as well as placing a uniform prior with lower bound 0 and upper bound 5 on our hyper-standard-deviation \(\sigma_{\tau}\). Thus, what is expressed mathematically as \(\tau \sim N(-5,10)\) and \(\sigma_{\tau} \sim U[0,5]\) is expressed in baggr as follows:

baggr(schools, "rubin",
      prior_hypermean = normal(-5, 10),
      prior_hypersd   = uniform(0, 5))

It is also possible to pass your custom priors as a list to baggr, as follows:

custom_priors <- list(hypermean = cauchy(0, 25), hypersd = normal(0, 30))
baggr(schools, "rubin", pooling = "partial", prior = custom_priors)

Note that the Rubin model assumes a Gaussian distribution of effects across groups. This is generally appropriate as a first pass at the problem (see McCulloch and Neuhaus (2011)) except if the distribution is known to be asymmetric for scientific reasons. For example, if you are working with risk ratios or odds ratios, these statistics cannot be negative and the chosen hyper-distribution should typically reflect that. However, in many cases it is possible and indeed standard to maintain the Gaussian assumption on a transform of the object: for example, you can safely fit the Rubin model to the logarithm of the risk ratios or odds ratios. While bearing in mind that log transforms obscure inherent dependencies between means and variances on the raw scale, this is still much better than applying a Gaussian to the raw object itself.

Understanding and criticising baggr model performance

Baggr models are run in Stan, and the fit and results need to be checked, understood and criticised as you would for any Stan model, or indeed any MCMC model-fitting exercise. You must pay attention to printed warnings about the Rhat criterion: if you see a warning that the Rhat statistic exceeds 1.05 for any parameter, you MUST NOT use the results for inference. This warning means the MCMC chains have not converged, and it is exceedingly unlikely that the “posterior inference” printed out corresponds to anything close to the true posterior implied by your model and data. If you use results for which the Rhat statistic exceeds 1.05, YOUR INFERENCE WILL BE WRONG.

If you see this warning, try re-running the model with the option iter set to a large number such as 10,000, as below. It is also good practice to run more chains, such as 8 rather than the default 4, to have a greater chance of detecting pathological MCMC behaviour. You do this by passing baggr the Stan arguments iter = 10000 and chains = 8, like so:

baggr_schools <- baggr(schools, model = "rubin", pooling = "partial",
                       iter = 10000, chains = 8)

Other warnings you may see involve “divergent transitions”. While not as serious as a high Rhat, these can signal problems with the model. As the Stan message you will see suggests, try adjusting the adapt_delta argument above 0.8, e.g. to 0.99. You cannot pass this parameter directly to stan and thus you cannot pass it directly to baggr; instead you must pass the argument control = list(adapt_delta = 0.99).

Measuring “pooling”

It is often useful to measure the extent to which the hierarchical model is “pooling”, or sharing, information across the groups in the process of aggregation. Baggr automatically computes and prints such a metric, as seen above. You can access more details by writing pooling(baggr_schools) or heterogeneity(baggr_schools).

Estimate of pooling in each group

In the output above we can see a “pooling” column next to each group. This is a statistic due to Gelman and Pardoe (2006).

In a partial pooling model (see the baggr vignette), group \(k\) (e.g. study) has a treatment effect estimate, with some SE around the real \(\tau_k\), i.e. \(\hat{\tau_k} \sim \mathcal{N}(\tau_k, \hat{se_k}^2)\) (\(\hat{se_k}\) is itself an estimate of the true \(se_k\)). Each \(\tau_k\) itself is distributed with mean \(\tau\) and variance \(\sigma_{\tau}^2\).

The quantity of interest is the ratio of the variability in \(\tau\) to the total variability. By convention, we subtract it from 1 to obtain a pooling metric \(p\):

\[p = 1 -\frac{\sigma_{\tau}^2}{\sigma_{\tau}^2 + se_k^2}\]
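To get a feel for the metric, here is a plug-in calculation with round, hypothetical numbers (a real \(p\) is computed over the whole posterior, not from a single value of \(\sigma_{\tau}\)):

```r
# Plug-in illustration of the pooling metric (hypothetical values):
sigma_tau <- 6.6   # a single value for the between-group SD of effects
se_k      <- 15    # one group's standard error
1 - sigma_tau^2 / (sigma_tau^2 + se_k^2)  # roughly 0.84: substantial pooling
```

The noisier a group's own estimate (large \(se_k\)) relative to the between-group spread, the closer \(p\) gets to 1 and the more that group is shrunk towards the hypermean.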

Note that, since \(\sigma_{\tau}^2\) is a Bayesian parameter (rather than a single fixed value), \(p\) is also a parameter. It is typical for \(p\) to have very high dispersion, as in many cases we cannot precisely estimate \(\sigma_{\tau}\). That is certainly the case in our example:

pooling(baggr_schools)
#> , , 1
#>
#>            [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
#> 2.5%  0.3528756 0.1950769 0.3828791 0.2267538 0.1640944 0.2267538 0.1950769
#> mean  0.8234212 0.7197472 0.8374443 0.7462999 0.6891478 0.7462999 0.7197472
#> 97.5% 0.9996293 0.9991663 0.9996742 0.9993109 0.9989710 0.9993109 0.9991663
#>            [,8]
#> 2.5%  0.4398478
#> mean  0.8611129
#> 97.5% 0.9997425

Overall pooling (in the model)

Sometimes researchers prefer to summarise heterogeneity using a single measure. One possibility is to provide a “big” estimate of pooling, \(P\), analogous to averaging \(p\) across groups:

\[P = 1 -\frac{\sigma_{\tau}^2}{\sigma_{\tau}^2 + \text{E}(se_k^2)}\]

where the average \(\text{E}(se_k^2)\) is taken over the \(K\) groups. Note that the denominator in the formula above is an application of the law of total variance to \(\hat{\tau_k}\): \(\text{Var}(\hat{\tau_k})\) is the sum of the between-study variance (\(\text{Var}({\tau_k})\)) and the average within-study variance (\(\text{E}(se_k^2)\)); von Hippel (2015) provides more details.
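As a rough plug-in version of this formula for the 8 schools data (again using a single hypothetical value of \(\sigma_{\tau}\) rather than its full posterior):

```r
# Plug-in illustration of overall pooling P (hypothetical sigma_tau):
se        <- c(15, 10, 16, 11, 9, 11, 10, 18)  # the schools' standard errors
sigma_tau <- 6.6
1 - sigma_tau^2 / (sigma_tau^2 + mean(se^2))   # roughly 0.79
```

This is in the same ballpark as the “Total pooling” of 0.76 printed earlier, which instead averages over the posterior of \(\sigma_{\tau}\).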

In many contexts, e.g. medical statistics, it is typical to report \(1-P\), called \(I^2\) (see Higgins et al. (2003) for an overview). Higher values of \(I^2\) indicate higher heterogeneity.

As with the group-specific estimates, \(P\) is a Bayesian parameter and its dispersion can be high.

Plotting and model comparison in baggr

A fundamental step to understanding the model is to plot the posterior distributions. baggr has several automatic plot functions which you can access by calling baggr_plot() or, equivalently, using the default plot function; these visuals are based on the bayesplot package. Plotting functions always take a baggr class object as their first argument. By default, means and 95% posterior intervals of the effects in each group are shown. Extra options are available, such as whether to order the results by effect size. For the 8 schools Rubin model we have

plot(baggr_schools, order = FALSE)

Similarly important is the plot of potential treatment effects (the posterior predictive distribution), obtained by repeatedly drawing hypermean and hyper-SD values from baggr's Markov chains, then drawing a new value of the treatment effect conditional on these:

effect_plot(baggr_schools)

The values underlying the plot can be obtained by effect_draw, which can be used e.g. to draw 1 new value:

effect_draw(baggr_schools, draws = 1)
#> [1] 3.574462

You can also summarise over the entire distribution to obtain mean, SD, and quantiles (effect_draw(baggr_schools, summary = TRUE)). For meta-regression models (models with covariates), you can also use this function to make predictions for new data, at set values of covariates.

However, from the viewpoint of model building, the most important plots are the ones that directly compare many possible models.

Basic model comparison with baggr_compare

The default Rubin model (which we have selected explicitly above) is that of partial pooling. When using baggr_compare without any extra arguments, full pooling, no pooling and partial pooling versions of the model will be fit:

my_baggr_comparison <- baggr_compare(schools)

The result of the comparison includes all the models, but it also produces an automatic comparison plot. Because the output of baggr_compare is a ggplot object, you can edit it further and build on it as you would any ggplot object by, for example, changing the theme or labels:

plot(my_baggr_comparison) + ggtitle("8 schools: model comparison")

Comparing existing models, understanding impact of the prior

The comparison can be made not just for different types of pooling, but for any number of models which include the same groups. A typical example is comparing two models on the same data, but with different priors.

Consider this model with alternative, very strong priors:

baggr_schools_v2 <- baggr(schools, prior_hypermean = normal(10, 2.5))

We can compare it with the previous model by providing both models as arguments to baggr_compare and effect_plot. It's good to name the arguments, as these names will be used to label them:

effect_plot("Default model" = baggr_schools,
            "normal(10, 2.5)" = baggr_schools_v2) +
  coord_cartesian(xlim = c(-10, 30)) +
  theme(legend.position = "top")

baggr_compare("Default model" = baggr_schools,
              "normal(10, 2.5)" = baggr_schools_v2)
#>
#> Mean treatment effects:
#>                     2.5%    mean   97.5%  median      sd
#> Default model   -1.80169 8.06995 18.0312 8.09241 5.04449
#> normal(10, 2.5)  5.26535 9.57988 13.9799 9.54550 2.21677
#>
#> SD for treatment effects:
#>                     2.5%    mean   97.5%  median      sd
#> Default model   0.288856 6.61242 20.3130 5.31352 5.52016
#> normal(10, 2.5) 0.260380 6.04686 18.1959 4.89995 4.83284
#>
#> Posterior predictive effects:
#>                      2.5%    mean   97.5%  median      sd
#> Default model   -11.02010 8.03149 28.9174 7.79236 9.85001
#> normal(10, 2.5)  -6.98111 9.55352 26.2898 9.40361 7.95353

Forest plots for the models

Forest plots are a typical way of reporting meta-analysis results. We adapted the default functionality from forestplot to work with baggr objects. You can choose to display 1) input data for groups, and/or 2) posterior distributions for groups. In both cases these are followed by a display of the mean treatment effect. The default is 1):

forest_plot(baggr_schools)

The plots can be modified by passing extra arguments – see ?forestplot. It's also possible to display both 1) and 2):

forest_plot(baggr_schools, show = "both")

Cross-validation in baggr

Baggr has built-in, automated leave-one-out cross-validation for its models. The values returned by loocv() can be used to understand how any one group affects the overall result, as well as how well the model predicts the omitted group.

This function automatically runs \(K\) baggr models, leaving out one group at a time, and then calculates the expected log predictive density (ELPD) for that group (see Gelman, Hwang, and Vehtari (2014)). The main output is the cross-validation information criterion: -2 times the ELPD summed over the \(K\) models. This is related to, and often approximated by, the Watanabe-Akaike information criterion. A value closer to zero (i.e. a smaller number in magnitude) means a better fit. For more information on cross-validation we recommend reading Gelman, Hwang, and Vehtari (2014).
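In other words, the reported information criterion is a simple rescaling of ELPD; for instance, a total ELPD of \(-31.8\) corresponds to

\[\text{looic} = -2 \times \text{elpd} = -2 \times (-31.8) = 63.6,\]

so a higher (less negative) ELPD means a smaller information criterion and a better fit.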

Therefore, if you have \(K\) groups, using loocv() will run your model of choice \(K\) times. Be aware that this may take a while even for simple models. (You should see a progress bar in your terminal window.) The loocv() function takes the same arguments as baggr(), plus an option (return_models) controlling whether to return all the models or just the summary statistics. For the 8 schools example we can do

loocv_res <- loocv(schools, return_models = FALSE,
                   iter = 1000, # just to make it a bit faster -- don't try it at home!
                   model = "rubin", pooling = "partial")
loocv_res
#> LOO estimate based on 8-fold cross-validation
#>
#>       Estimate Standard Error
#> elpd     -31.8          0.991
#> looic     63.6          1.980

The loocv() output contains more than just the matrix it prints, and this additional information can be accessed via attributes(), e.g. the mean treatment effects, their variability and elpd for each model are stored in the attribute df:

names(attributes(loocv_res))
#> [1] "names" "class"
attr(loocv_res, "df")
#> NULL

This data frame can then be used to examine or compute the variation in the inference on \(\tau\) in the absence of each group. If the user is interested in manually checking the consequences of excluding a particular group or set of groups, this is also possible in baggr using subsetting. For example, suppose that we want to run the Rubin model on school groups 1-7 and predict the effect in the 8th school. The code below shows how you can specify a subset of the data frame as your “data” argument, and then designate another subset as the “testing” holdout set by assigning it to the argument “test_data” in the baggr command. Here we have done it for both partial and full pooling:

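The code block for this step appears to be missing from this copy of the vignette. A sketch consistent with the description above would be as follows; the exact calls in the original vignette may differ:

```r
# Fit on schools 1-7, holding out the 8th school as test data
# (reconstruction of the missing chunk; argument usage assumed):
fit1 <- baggr(schools[1:7,], model = "rubin", pooling = "partial",
              test_data = schools[8,])
fit2 <- baggr(schools[1:7,], model = "rubin", pooling = "full",
              test_data = schools[8,])
```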
We can compare the performance of the two models using the mean log predictive density. As the name suggests, this is itself a density, so we compute the log of its expected value: here, as before, a number closer to zero is better. In this case the full pooling model actually does slightly better:

fit1$mean_lpd
#> [1] 7.94258
fit2$mean_lpd
#> [1] 7.739746

If we run the full loocv for both models, we can estimate the difference between the ELPDs of those models as well as the standard error of that difference. As an example, let's consider the two models above.

loocv_full <- loocv(data = schools, model = "rubin", pooling = "full")

We can compare those fits with loo_compare, which gives us the differences in ELPD for the models we pass it. Generally it is important to pay attention to both the difference in predictive density and the standard error of that difference.

loo_compare(loocv_res, loocv_full)
#> Comparison of cross-validation
#>
#>                     ELPD ELPD SE
#> Model 1 - Model 2 -0.742   0.308
#>
#> Positive ELPD indicates the reference group is preferred.

References

Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. CRC Press.
Gelman, Andrew, Jessica Hwang, and Aki Vehtari. 2014. “Understanding Predictive Information Criteria for Bayesian Models.” Statistics and Computing 24 (6): 997–1016. https://doi.org/10.1007/s11222-013-9416-2.
Gelman, Andrew, and Iain Pardoe. 2006. “Bayesian Measures of Explained Variance and Pooling in Multilevel (Hierarchical) Models.” Technometrics 48 (2): 241–51. https://doi.org/10.1198/004017005000000517.
Higgins, Julian P T, Simon G Thompson, Jonathan J Deeks, and Douglas G Altman. 2003. “Measuring Inconsistency in Meta-Analyses.” BMJ: British Medical Journal 327 (7414): 557–60.
McCulloch, Charles E., and John M. Neuhaus. 2011. “Misspecifying the Shape of a Random Effects Distribution: Why Getting It Wrong May Not Matter.” Statistical Science 26 (3): 388–402.
Meager, Rachael. 2019. “Aggregating Distributional Treatment Effects: A Bayesian Hierarchical Analysis of the Microcredit Literature.” https://doi.org/10.31222/osf.io/7tkvm.
Rubin, Donald B. 1974. “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies.” Journal of Educational Psychology.
———. 1981. “Estimation in Parallel Randomized Experiments.” Journal of Educational Statistics 6 (4): 377–401.
von Hippel, Paul T. 2015. “The Heterogeneity Statistic I2 Can Be Biased in Small Meta-Analyses.” BMC Medical Research Methodology 15 (April). https://doi.org/10.1186/s12874-015-0024-z.
