Movatterモバイル変換

Type:

Package

Title:

Bayesian Synthetic Likelihood

Version:

3.2.5

Date:

2022-11-02

Description:

Bayesian synthetic likelihood (BSL, Price et al. (2018) <doi:10.1080/10618600.2017.1302882>) is an alternative to standard, non-parametric approximate Bayesian computation (ABC). BSL assumes a multivariate normal distribution for the summary statistic likelihood and it is suitable when the distribution of the model summary statistics is sufficiently regular. This package provides a Metropolis Hastings Markov chain Monte Carlo implementation of four methods (BSL, uBSL, semiBSL and BSLmisspec) and two shrinkage estimators (graphical lasso and Warton's estimator).uBSL (Price et al. (2018) <doi:10.1080/10618600.2017.1302882>) uses an unbiased estimator to the normal density. A semi-parametric version of BSL (semiBSL, An et al. (2018) <doi:10.48550/arXiv.1809.05800>) is more robust to non-normal summary statistics. BSLmisspec (Frazier et al. 2019 <doi:10.48550/arXiv.1904.04551>) estimates the Gaussian synthetic likelihood whilst acknowledging that there may be incompatibility between the model and the observed summary statistic. Shrinkage estimation can help to decrease thenumber of model simulations when the dimension of the summary statistic is high (e.g., BSLasso, An et al. (2019) <doi:10.1080/10618600.2018.1537928>). Extensions to this package are planned. For a journal article describing howto use this package, see An et al. (2022) <doi:10.18637/jss.v101.i11>.

Depends:

R (≥ 3.3.0)

License:

GPL-2 |GPL-3 [expanded from: GPL (≥ 2)]

LazyLoad:

yes

Imports:

glasso, ggplot2, MASS, mvtnorm, copula, whitening, graphics,gridExtra, foreach, coda, Rcpp, doRNG, methods, stringr, Rdpack(≥ 0.7)

Suggests:

elliplot, doParallel, rbenchmark, mixtools

LinkingTo:

RcppArmadillo, Rcpp

LazyData:

true

RoxygenNote:

7.1.1

Encoding:

UTF-8

Collate:

'BSL-package.R' 'RcppExports.R' 's4-MODEL.R' 's4-BSL.R''bsl.R' 'cell.R' 'combinePlotsBSL.R' 'covWarton.R''estimateLoglike.R' 'estimateWhiteningMatrix.R''gaussianRankCorr.R' 'gaussianSynLike.R''gaussianSynLikeGhuryeOlkin.R' 'imports.R' 'kernelCDF.R''logitTransform.R' 'ma2.R' 'mgnk.R' 'myMiniProgressBar.R''s4-PENALTY.R' 'selectPenalty.R' 'semiparaKernelEstimate.R''sliceGammaMean.R' 'sliceGammaVariance.R' 'synLikeMisspec.R''toad.R'

RdMacros:

Rdpack

NeedsCompilation:

yes

Packaged:

2022-11-02 05:28:19 UTC; southl

Author:

Ziwen An

[aut], Leah F. South

[aut, cre], Christopher C. Drovandi

[aut]

Maintainer:

Leah F. South <l1.south@qut.edu.au>

Repository:

CRAN

Date/Publication:

2022-11-03 09:00:07 UTC

Bayesian synthetic likelihood

Description

Bayesian synthetic likelihood (BSL,Price et al. (2018)) is an alternative to standard,non-parametric approximate Bayesian computation (ABC). BSL assumes amultivariate normal distribution for the summary statistic likelihood and itis suitable when the distribution of the model summary statistics issufficiently regular.

In this package, a Metropolis Hastings Markov chain Monte Carlo (MH-MCMC)implementation of BSL is available. We also include implementations of fourmethods (BSL, uBSL, semiBSL and BSLmisspec) and two shrinkage estimators(graphical lasso and Warton's estimator).

Methods: (1) BSL (Price et al. 2018), which is the standard form ofBayesian synthetic likelihood, assumes the summary statistic is roughlymultivariate normal; (2) uBSL (Price et al. 2018), which uses anunbiased estimator to the normal density; (3) semiBSL(An et al. 2019), which relaxes the normality assumption to anextent and maintains the computational advantages of BSL without any tuning;and (4) BSLmisspec (Frazier and Drovandi 2021), which estimates theGaussian synthetic likelihood whilst acknowledging that there may beincompatibility between the model and the observed summary statistic.

Shrinkage estimators are designed particularly to reduce the number ofsimulations if method is BSL or semiBSL: (1) graphical lasso(Friedman et al. 2008) finds a sparse precision matrix with anL1-regularised log-likelihood. An et al. (2019) usegraphical lasso within BSL to bring down the number of simulationssignificantly when the dimension of the summary statistic is high; and (2)Warton's estimator (Warton 2008) penalises the correlationmatrix and is straightforward to compute. When using the Warton's shrinkageestimator, it is also possible to utilise the Whitening transformation(Kessy et al. 2018) to help decorrelate the summary statsitics, thusencouraging sparsity of the synthetic likelihood covariance matrix.

Parallel computing is supported through theforeach package and userscan specify their own parallel backend by using packages likedoParallel ordoMC. Then model simulations required toestimate the synthetic likelihood at each iteration of MCMC will bedistributed across multiple cores. Alternatively a vectorised simulationfunction that simultaneously generatesn model simulations is alsosupported.

The main functionality is available through:

bsl: The general function to perform BSL,uBSL, or semiBSL (with or without parallel computing).
selectPenalty: A function to select the penalty when usingshrinkage estimation within BSL or semiBSL.

Several examples have also been included. These examples can be used toreproduce the results of An et al. (2019), and can help practitioners learnhow to use the package.

ma2: The MA(2) example fromAn et al. (2019).
mgnk: The multivariate G&K example fromAn et al. (2019).
cell: The cell biology example fromPrice et al. (2018) and An et al. (2019).
toad: The toad example fromMarchand et al. (2017), and also considered inAn et al. (2019).

Extensions to this package are planned. For a journal article describing howto use this package, including full descriptions on the MA(2) and toad examples,see An et al. (2022).

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).

An Z, South LF, Drovandi CC (2022).“BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood.”Journal of Statistical Software,101(11), 1–33.doi:10.18637/jss.v101.i11.

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.

Marchand P, Boenke M, Green DM (2017).“A stochastic movement model reproduces patterns of site fidelity and long-distance dispersal in a population of Fowlers toads (Anaxyrus fowleri).”Ecological Modelling,360, 63–69.ISSN 0304-3800, doi:10.1016/j.ecolmodel.2017.06.025.

Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

S4 class “BSL”.

Description

The S4 class “BSL” is produced by running functionbsl and contains the result of a BSL run. Basic S4 methodsshow,summary andplot are provided.theta andloglike returns the MCMC samples of parameter values and estimatedlog-likelihoods.

Usage

## S4 method for signature 'BSL'show(object)## S4 method for signature 'BSL'summary(object, burnin = 0, thetaNames = NULL)## S4 method for signature 'BSL,ANY'plot(  x,  which = 1L,  thin = 1,  burnin = 0,  thetaTrue = NULL,  options.plot = NULL,  top = "Approximate Univariate Posteriors",  options.density = list(),  options.theme = list())## S4 method for signature 'BSL'getTheta(object, burnin = 0, thin = 1)## S4 method for signature 'BSL'getLoglike(object, burnin = 0, thin = 1)## S4 method for signature 'BSL'getGamma(object, burnin = 0, thin = 1)

Arguments

object

A “BSL” class object.

burnin

the number of MCMC burn-in steps to be taken.

thetaNames

Parameter names to be shown in the summary table. If notgiven, parameter names of the “BSL” object will be used by default.

x

A “BSL” class object to plot.

which

An integer argument indicating which plot function to beused. The default,1L, uses the plainplot to visualise theresult.2L uses ggplot2 to draw the plot.

thin

A numeric argument indicating the gap between samples tobe taken when thinning the MCMC draws. The default is1, which meansno thinning is used.

thetaTrue

A set of true parameter values to be included on the plotsas a reference line. The default isNULL.

options.plot

A list of additional arguments to pass into theplot function. Only use whenwhich is1L.

top

A character argument of the combined plot title ifwhich is2L.

options.density

A list of additional arguments to pass into thegeom_density function. Only use whenwhich is2L.

options.theme

A list of additional arguments to pass into thetheme function. Only use whenwhich is2L.

Slots

theta: Object of class “matrix”. MCMC samples from the jointapproximate posterior distribution of the parameters.
loglike: Object of class “numeric”. Accepted MCMC samples of theestimated log-likelihood values.
call: Object of class “call”. The original code that was used to callthe method.
model: Object of class “MODEL”.
acceptanceRate: Object of class “numeric”. The acceptance rate of theMCMC algorithm.
earlyRejectionRate: Object of class “numeric”. The early rejectionrate of the algorithm (early rejection may occur when using bounded priordistributions).
errorRate: Object of class “numeric”. The error rate. If any infinitesummary statistic or infinite log-likelihood estimate occurs during theprocess, it is marked as an error and the proposed parameter will berejected.
y: Object of class “ANY”. The observed data.
n: Object of class “numeric”. The number of simulations from the modelper MCMC iteration to estimate the synthetic likelihood.
M: Object of class “numeric”. The number of MCMC iterations.
covRandWalk: Object of class “matrix”. The covariance matrix used inmultivariate normal random walk proposals.
method: Object of class “character”. The character argument indicatingthe used method.
shrinkage: Object of class “characterOrNULL”. The character argumentindicating the shrinkage method.
penalty: Object of class “numericOrNULL”. The penalty value.
GRC: Object of class “logical”. Whether the Gaussian rank correlationmatrix is used.
logitTransform: Object of class “logical”. The logical argumentindicating whether a logit transformation is used in the algorithm.
logitTransformBound: Object of class “matrixOrNULL”. The matrix oflogitTransformBound.
standardise: Object of class “logical”. The logical argument thatdetermines whether to standardise the summary statistics.
parallel: Object of class “logical”. The logical value indicatingwhether parallel computing is used in the process.
parallelArgs: Object of class “listOrNULL”. The list of additionalarguments to pass into theforeach function.
time: Object of class “difftime”. The running time.
gamma: Object of class “numeric”. MCMC samples of gamma parametervalues of the mean adjustment or variance inflation for method“BSLmisspec”.
misspecType: Object of class “characterOrNULL”. The character argumentindicating whether mean adjustment ("mean") or variance inflation("variance") to be used in "BSLmisspec" method.
tau: Object of class “numeric”. Parameter of the prior distributionfor "BSLmisspec" method. For mean adjustment,tau is the scale ofthe Laplace distribution. For variance inflation,tau is the mean ofthe exponential distribution.
whitening: Object of class “logicalOrMatrixOrNULL”. A logical argumentdetermines whether Whitening transformation is used in “BSL” method withWarton's shrinkage, or just the Whitening matrix used.

Examples

## Not run: # a toy exampletoy_simVec <- function(n, theta) matrix(rnorm(n, theta), nrow = n) # the simulation functiontoy_sum <- function(x) x # the summary statistic functionmodel <- newModel(fnSimVec = toy_simVec, fnSum = toy_sum, theta0 = 0) # create the model objectresult_toy <- bsl(y = 1, n = 100, M = 1e4, model = model, covRandWalk = matrix(1))summary(result_toy)plot(result_toy)## End(Not run)

S4 class “MODEL”

Description

The S4 class contains the simulation and summary statisticsfunction and other necessary arguments for a model to run in the mainbsl function.

newModel is the constructor function for aMODELobject.

simulation runs a number of simulations and computes thecorreponding summary statistics with the provided model.

summStat computes the summary statistics with the given data and model object.The summary statistics function and relevant arguments are obtained from the model.

Usage

newModel(  fnSim,  fnSimVec,  fnSum,  fnLogPrior,  simArgs,  sumArgs,  theta0,  thetaNames,  test = TRUE,  verbose = TRUE)## S4 method for signature 'MODEL'simulation(  model,  n = 1,  theta = model@theta0,  summStat = TRUE,  parallel = FALSE,  parallelArgs = NULL,  seed = NULL)## S4 method for signature 'ANY,MODEL'summStat(x, model)

Arguments

fnSim

A function that simulates data for a given parametervalue. The first argument should be the parameters. Other necessaryarguments (optional) can be specified withsimArgs.

fnSimVec

A vectorised function that simulates a number ofdatasets simultaneously for a given parameter value. The first twoarguments should be the number of simulations to run and parameters,respectively. Other necessary arguments (optional) can be specified withsimArgs. The output must be a list of each simulation result or amatrix with each row corresponding to a simulation.

fnSum

A function for computing summary statistics of data. Thefirst argument should be the observed or simulated dataset. Other necessaryarguments (optional) can be specified withsumArgs.

fnLogPrior

A function that computes the log of prior density for aparameter. If this is missing, the prior by default is an improper flatprior over the real line for each parameter. The function must have asingle input: a vector of parameter values.

simArgs

A list of additional arguments to pass into the simulationfunction. Only use when the inputfnSim requires additionalarguments.

sumArgs

A list of additional arguments to pass into the summarystatistics function. Only use when the inputfnSum requiresadditional arguments.

theta0

Initial guess of the parameter value.

thetaNames

A string vector of parameter names, which must have thesame length as the parameter vector.

test

Logical, whether a short simulation test will be ranupon initialisation.

verbose

Logical, whether to print verbose messages wheninitialising a “MODEL” object.

model

A “MODEL” class object.

n

The number of simulations to run.

theta

The parameter value.

summStat

Logical indicator whether the correpsonding summary statisticsshould be returned or not. The default isTRUE.

parallel

A logical value indicating whether parallel computing shouldbe used for simulation and summary statistic evaluation. The default isFALSE. When model simulation is fast, it may be preferable toperform serial or vectorised computations to avoid significantcommunication overhead between workers. Parallel computation can only beused if not using a vectorised simulation function, seeMODELfor options of vectorised simulation function.

parallelArgs

A list of additional arguments to pass into theforeach function. Only used when parallel computing is enabled,default isNULL.

seed

A seed number to pass to theset.seed function. Thedefault isNULL, when no seed number is specified. Please noteparallel also affects the result even with the same seed.

x

The data to pass to the summary statistics function.

Value

A list of simulation results using the given parameter.xcontains the raw simulated datasets.ssx contains the summarystatistics.

A vector of the summary statistics.

Slots

fnSim: A function that simulates data for a given parameter value. Thefirst argument should be the parameters. Other necessary arguments(optional) can be specified withsimArgs.
fnSimVec: A vectorised function that simulates a number of datasetssimultaneously for a given parameter value. If this is notNULL,vectorised simulation function will be used instead offnSim. Thefirst two arguments should be the number of simulations to run andparameters, respectively. Other necessary arguments (optional) can bespecified withsimArgs. The output must be a list of each simulationresult.
fnSum: A function for computing summary statistics of data. The firstargument should be the observed or simulated dataset. Other necessaryarguments (optional) can be specified withsumArgs. The users shouldcode this function carefully so the output have fixed length and nevercontain anyInf value.
fnLogPrior: A function that computes the log of prior density for aparameter. The default isNULL, which uses an improper flat priorover the real line for each parameter. The function must have a singleinput: a vector of parameter values.
simArgs: A list of additional arguments to pass into the simulationfunction. Only use when the inputfnSim orfnSimVec requiresadditional arguments. The default isNULL.
sumArgs: A list of additional arguments to pass into the summarystatistics function. Only use when the inputfnSum requiresadditional arguments. The default isNULL.
theta0: Initial guess of the parameter value, which is used as thestarting value for MCMC.
thetaNames: Expression, parameter names.
ns: The number of summary statistics of a single observation. Note thiswill be generated automatically, thus is not required for initialisation.
test: Logical, whether a short simulation test will be ran uponinitialisation.
verbose: Logical, whether to print verbose messages when initialising a“MODEL” object.

Examples

# set up the model for the ma2 exampledata(ma2)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,                  theta0 = ma2$start, fnLogPrior = ma2_prior, verbose = FALSE)validObject(m)# benchmark the serial and vectorised simulation function (require the rbenchmark package)m1 <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,            theta0 = ma2$start, fnLogPrior = ma2_prior)m2 <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args,            theta0 = ma2$start, fnLogPrior = ma2_prior)require("rbenchmark")## Not run: benchmark(serial  = simulation(m1, n = 1000, theta = c(0.6, 0.2)),          vectorised  = simulation(m2, n = 1000, theta = c(0.6, 0.2)))## End(Not run)

S4 class “PENALTY”

Description

This S4 class contains the penalty selection result fromfunctionselectPenalty.show display the penaltyselection result.plot plot the penalty selection result usingggplot2.

Usage

## S4 method for signature 'PENALTY'show(object)## S4 method for signature 'PENALTY,ANY'plot(x, logscale = TRUE)## S4 method for signature 'BSL'getPenalty(object)

Arguments

object

The S4 object of class “PENALTY” to show.

x

The S4 object of class “PENALTY” to plot.

logscale

A logical argument whether the x-axis (penalty) should be log transformed. Thedefault isTRUE.

Slots

loglike: A list of the log-likelihood values. The list contains multiplematrices (each corresponds to the result for a specific n value). Thenumber of row of the matrix equals to the number of repeatsM. Thecolumns of the matrix stands for different penalty values.
n: A vector ofn, the number of simulations from the model perMCMC iteration for estimating the synthetic likelihood.
lambda: A list, with each entry containing the vector of penalty valuesfor the corresponding choice ofn.
M: The number of repeats used in estimating the standard deviation ofthe estimated log synthetic likelihood.
sigma: The standard deviation of the log synthetic likelihood estimatorto aim for, usually a value between 1 and 2. This reflects the mixing of aMarkov chain.
model: A “MODEL” object generated with functionnewModel.SeenewModel.
stdLoglike: A list contains the estimated standard deviations oflog-likelihoods.
penalty: The vector stores the selected penalty values for each givenn by choosing from the candidatelambda list. The selectedvalues produce closest standard deviationsstdLoglike to the targetsigma.
result: The result data frame.
call: The original code used to runselectPenalty.

Performing BSL, uBSL, semiBSL and BSLmisspec

Description

This is the main function for performing MCMC BSL (with astandard or non-standard likelihood estimator) to sample from theapproximate posterior distribution. A couple of extentions to the standardapproach are available by changing the following arguments,method,shrinkage,whitening,misspecType. Parallel computingis supported with the R packageforeach.

Usage

bsl(  y,  n,  M,  model,  covRandWalk,  theta0,  fnSim,  fnSum,  method = c("BSL", "uBSL", "semiBSL", "BSLmisspec"),  shrinkage = NULL,  penalty = NULL,  fnPrior = NULL,  simArgs = NULL,  sumArgs = NULL,  logitTransformBound = NULL,  standardise = FALSE,  GRC = FALSE,  whitening = NULL,  misspecType = NULL,  tau = 1,  parallel = FALSE,  parallelArgs = NULL,  thetaNames = NULL,  plotOnTheFly = FALSE,  verbose = 1L)

Arguments

y

The observed data. Note this should be the raw dataset NOT theset of summary statistics.

n

The number of simulations from the model per MCMC iteration forestimating the synthetic likelihood.

M

The number of MCMC iterations.

model

A “MODEL” object generated with functionnewModel. SeenewModel.

covRandWalk

The covariance matrix of a multivariate normal random walkproposal distribution used in the MCMC.

theta0

Deprecated, will be removed in the future, usemodelinstead. Initial guess of the parameter value, which is used as thestarting value for MCMC.

fnSim

Deprecated, will be removed in the future, usemodel instead. A function that simulates data for a given parametervalue. The first argument should be the parameters. Other necessaryarguments (optional) can be specified withsimArgs.

fnSum

Deprecated, will be removed in the future, usemodel instead. A function for computing summary statistics of data.The first argument should be the observed or simulated dataset. Othernecessary arguments (optional) can be specified withsumArgs.

method

A string argument indicating the method to be used. Thedefault, “BSL”, runs standard BSL. “uBSL” uses the unbiased estimatorof a normal density of Ghurye and Olkin (1969). “semiBSL”runs the semi-parametric BSL algorithm and is more robust to non-normalsummary statistics. “BSLmisspec” estimates the Gaussian syntheticlikelihood whilst acknowledging that there may be incompatibility betweenthe model and the observed summary statistic (Frazier and Drovandi 2021).

shrinkage

A string argument indicating which shrinkage method tobe used. The default isNULL, which means no shrinkage is used.Shrinkage estimation is only available for methods “BSL” and “semiBSL”.Current options are “glasso” for the graphical lasso method ofFriedman et al. (2008) and “Warton” for the ridgeregularisation method of Warton (2008).

penalty

The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”.

fnPrior

Deprecated, will be removed in the future, usemodel instead. A function that computes the log prior density for aparameter. The default isNULL, which uses an improper flat priorover the real line for each parameter. The function must have a singleinput: a vector of parameter values.

simArgs

Deprecated, will be removed in the future, usemodel instead. A list of additional arguments to pass into thesimulation function. Only use when the inputfnSim requiresadditional arguments. The default isNULL.

sumArgs

Deprecated, will be removed in the future, usemodel instead. A list of additional arguments to pass into thesummary statistics function. Only use when the inputfnSum requiresadditional arguments. The default isNULL.

logitTransformBound

Ap by2 numeric matrix indicating theupper and lower bounds of parameters if a logit transformation is used onthe parameter space, wherep is the number of parameters. The defaultisNULL, which means no logit transformation is used. It is alsopossible to define other transformations within the simulation and priorfunction frommodel. The first column contains the lower bound ofeach parameter and the second column contains the upper bound. Infinitelower or upper bounds are also supported, eg.matrix(c(1,Inf,0,10,-Inf,0.5),3,2,byrow=TRUE).

standardise

A logical argument that determines whether to standardisethe summary statistics before applying the graphical lasso. This is onlyvalid if method is “BSL”, shrinkage is “glasso” and penalty is notNULL. The diagonal elements will not be penalised if the shrinkagemethod is “glasso”. The default isFALSE.

GRC

A logical argument indicating whether the Gaussian rankcorrelation matrix (Boudt et al. 2012) should be used to estimatethe covariance matrix in “BSL” method. The default isFALSE, whichuses the sample covariance by default.

whitening

This argument determines whether Whitening transformationshould be used in “BSL” method with Warton's shrinkage. Whiteningtransformation helps decorrelate the summary statistics, thus encouragingsparsity of the synthetic likelihood covariance matrix. This might allowheavier shrinkage to be applied without losing much accuracy, henceallowing the number of simulations to be reduced. By default,NULLrepresents no Whitening transformation. Otherwise this is enabled if aWhitening matrix is provided. SeeestimateWhiteningMatrix forthe function to estimate the Whitening matrix.

misspecType

A string argument indicating which type of modelmisspecification to be used. The two options are "mean" and "variance".Only used when method is “BSLmisspec”. The default,NULL, means nomodel misspecification is considered.

tau

A numeric argument, parameter of the prior distributionfor "BSLmisspec" method. For mean adjustment,tau is the scale ofthe Laplace distribution. For variance inflation,tau is the mean ofthe exponential distribution. Only used when method is “BSLmisspec”.

parallel

parallelArgs

A list of additional arguments to pass into theforeach function. Only used when parallel computing is enabled,default isNULL.

thetaNames

Deprecated, will be removed in the future, usemodelinstead. A string vector of parameter names, which must have the samelength as the parameter vector. The default isNULL.

plotOnTheFly

A logical or numeric argument defining whether or by howmany iterations a posterior figure will be plotted during running. IfTRUE, a plot of approximate univariate posteriors based on thecurrent accepted samples will be shown every one thousand iterations.The default isFALSE.

verbose

An integer indicating the verbose style. 0Lmeans no verbose messages will be printed. 1L uses a custom progress bar totrack the progress. 2L prints the iteration numbers (1:M) to trackthe progress. The default is 1L.

Value

An object of classbsl is returned, seeBSLfor more information of the S4 class.

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.

Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.

An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).

Examples

## Not run: # This is just a minimal test run, please see package built-in examples for more# comprehensive usages of the functiontoy_sim <- function(n, theta) matrix(rnorm(n, theta), nrow = n)toy_sum <- function(x) xmodel <- newModel(fnSimVec = toy_sim, fnSum = toy_sum, theta0 = 0)result_toy <- bsl(y = 1, n = 100, M = 1e4, model = model, covRandWalk = matrix(1),    method = "BSL", plotOnTheFly = TRUE)summary(result_toy)plot(result_toy)## End(Not run)

Cell biology example

Description

This example estimates the probabilities of cell motility andcell proliferation for a discrete-time stochastic model of cell spreading.We provide the data and tuning parameters required to reproduce the resultsin An et al. (2019).

Usage

data(ma2)cell_sim(theta, Yinit, rows, cols, sim_iters, num_obs)cell_sum(Y, Yinit)cell_prior(theta)

Arguments

theta

A vector of proposed model parameters,P_m andP_p.

Yinit

The initial matrix of cell presences of sizerows\timescols.

rows

The number of rows in the lattice (rows in the cell locationmatrix).

cols

The number of columns in the lattice (columns in the celllocation matrix).

sim_iters

The number of discretisation steps to get to when anobservation is actually taken. For example, if observations are taken every5 minutes but the discretisation level is 2.5 minutes, thensim_iters would be 2. Larger values ofsim_iters lead to more“accurate” simulations from the model, but they also increase thesimulation time.

num_obs

The total number of images taken after initialisation.

Y

Arows\timescols\timesnum_obs arrayof the cell presences at times1:num_obs (not time 0).

Details

Cell motility (movement) and proliferation (reproduction) causetumors to spread and wounds to heal. If we can measure cell proliferationand cell motility under different situations, then we may be able to usethis information to determine the efficacy of different medical treatments.

A common method for measuring in vitro cell movement and proliferation isthe scratch assay. Cells form a layer on an assay and, once they arecompletely covering the assay, a scratch is made to separate the cells.Images of the cells are taken until the scratch has closed up and the cellsare in contact again. Each image can be converted to a binary matrix byforming a lattice and recording the binary matrix (of sizerows\timescols) of cell presences.

The model that we consider is a random walk model with parameters for theprobability of cell movement(P_m) and the probabilityof cell proliferation(P_p) and it has notractable likelihood function. We use the vague priorsP_m \sim U(0,1)andP_p \sim U(0,1).

We have a total of 145 summary statistics, which are made up of the Hammingdistances between the binary matrices for each time point and the totalnumber of cells at the final time.

Details about the types of cells that this model is suitable for and otherinformation can be found in Price et al. (2018) andAn et al. (2019). Johnston et al. (2014)use a different ABC method and different summary statistics for a similarexample.

Functions

cell_sim: The functioncell_sim(theta, Yinit, rows, cols,sim_iters, num_obs) simulates data from the model, using C++ in thebackend.
cell_sum: The functioncell_sum(Y,sum_options) calculates thesummary statistics for this example.
cell_prior: The functioncell_prior(theta) evaluates the logprior density at the parameter value\theta.

A simulated dataset

An example “observed” dataset and the tuning parameters relevant to thatexample can be obtained usingdata(cell). This “observed” data isa simulated dataset withP_m = 0.35 andP_p = 0.001. The lattice has 27rows and 36cols and there arenum_obs = 144 observations after time 0(to mimic images being taken every 5 minutes for 12 hours). The simulationis based on there initially being 110 cells in the assay.

Further information about the specific choices of tuning parameters used inBSL and BSLasso can be found in An et al. (2019).

data: Therows\timescols\timesnum_obs array of the cellpresences at times 1:144.
sim_args: Values ofsim_args relevant to this example.
sum_args: Values ofsum_args relevant to this example,i.e. just the value ofYinit.
start: A vector of suitable initial values of the parametersfor MCMC.
cov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 2\times 2 matrix.

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Johnston ST, Simpson MJ, McElwain DLS, Binder BJ, Ross JV (2014).“Interpreting scratch assays using pair density dynamics and approximate Bayesian computation.”Open Biology,4(9), 140097.doi:10.1098/rsob.140097.

Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.

Examples

## Not run: require(doParallel) # You can use a different package to set up the parallel backend# Loading the data for this exampledata(cell)model <- newModel(fnSim = cell_sim, fnSum = cell_sum, simArgs = cell$sim_args,                  sumArgs = cell$sum_args, theta0 = cell$start, fnLogPrior = cell_prior,                  thetaNames = expression(P[m], P[p]))thetaExact <- c(0.35, 0.001)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov,                     parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellBSL)summary(resultCellBSL)plot(resultCellBSL, thetaTrue = thetaExact, thin = 20)# Performing uBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCelluBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov,                      method = "uBSL", parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCelluBSL)summary(resultCelluBSL)plot(resultCelluBSL, thetaTrue = thetaExact, thin = 20)# Performing tuning for BSLassossy <- cell_sum(cell$data, cell$sum_args$Yinit)lambda_all <- list(exp(seq(0.5,2.5,length.out=20)), exp(seq(0,2,length.out=20)),                   exp(seq(-1,1,length.out=20)), exp(seq(-1,1,length.out=20)))# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)set.seed(100)sp_cell <- selectPenalty(ssy, n = c(500, 1000, 1500, 2000), lambda_all, theta = thetaExact,    M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso",    parallelSim = TRUE, parallelMain = FALSE)stopCluster(cl)registerDoSEQ()sp_cellplot(sp_cell)# Performing BSLasso with a fixed penalty (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellBSLasso <- bsl(cell$data, n = 1500, M = 10000, model = model, covRandWalk = cell$cov,                          shrinkage = "glasso", penalty = 1.3, parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellBSLasso)summary(resultCellBSLasso)plot(resultCellBSLasso, thetaTrue = thetaExact, thin = 20)# Performing semiBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellSemiBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov,                          method = "semiBSL", parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellSemiBSL)summary(resultCellSemiBSL)plot(resultCellSemiBSL, thetaTrue = thetaExact, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(5, 4, 1, 2), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultCellBSL, resultCelluBSL, resultCellBSLasso, resultCellSemiBSL),    which = 1, thetaTrue = thetaExact, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"),    col = 1:4, lty = 1:4, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, cex = 1.5)par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)

Plot the densities of multiple “bsl” class objects.

Description

The functioncombinePlotsBSL can be used to plot multipleBSL densities together, optionally with the true values for the parameters.

Usage

combinePlotsBSL(  objectList,  which = 1L,  thin = 1,  burnin = 0,  thetaTrue = NULL,  label = NULL,  legendPosition = c("auto", "right", "bottom")[1],  legendNcol = NULL,  col = NULL,  lty = NULL,  lwd = NULL,  cex.lab = 1,  cex.axis = 1,  cex.legend = 0.75,  top = "Approximate Marginal Posteriors",  options.color = list(),  options.linetype = list(),  options.size = list(),  options.theme = list())

Arguments

objectList

A list of “bsl” class objects.

which

An integer argument indicating which plot function to beused. The default,1L, uses the plainplot to visualise theresult.2L uses ggplot2 to draw the plot.

thin

A numeric argument indicating the gap between samples tobe taken when thinning the MCMC draws. The default is1, which meansno thinning is used.

burnin

the number of MCMC burn-in steps to be taken.

thetaTrue

A set of true parameter values to be included on the plotsas a reference line. The default isNULL.

label

A string vector indicating the labels to be shown inthe plot legend. The default isNULL, which uses the names fromobjectList.

legendPosition

One of the three string arguments, “auto”, “right”or “bottom”, indicating the legend position. The default is “auto”,which automatically choose from “right” and “bottom”. Only used whenwhich is1L.

legendNcol

An integer argument indicating the number of columns ofthe legend. The default,NULL, put all legends in the same row orcolumn depending onlegendPosition. Only used whenwhich is1L.

col

A vector argument containing the plotting color foreach density curve. Each element of the vector will be passed intolines. Only used whenwhich is1L.

lty

A vector argument containing the line type for eachdensity curve. Each element of the vector will be passed intolines.Only used whenwhich is1L.

lwd

A vector argument containing the line width for eachdensity curve. Each element of the vector will be passed intolines.Only used whenwhich is1L.

cex.lab

The magnification to be used for x and y labelsrelative to the current setting of cex. To be passed intoplot. Onlyused whenwhich is1L.

cex.axis

The magnification to be used for axis annotationrelative to the current setting of cex. To be passed intoplot. Onlyused whenwhich is1L.

cex.legend

The magnification to be used for legend annotationrelative to the current setting of cex. Only used whenwhich is1L.

top

A string argument of the combined plot title. Only usedwhenwhich is2L.

options.color

A list of additional arguments to pass into functionggplot2::scale_color_manual. Only used whenwhich is2L.

options.linetype

A list of additional arguments to pass into functionggplot2::scale_linetype_manual. Only used whenwhich is2L.

options.size

A list of additional arguments to pass into functionggplot2::scale_size_manual. Only used whenwhich is2L.

options.theme

A list of additional arguments to pass into thetheme function. Only use whenwhich is2L.

Value

No return value, called for the plots produced.

Examples

## Not run: toy_sim <- function(n, theta) matrix(rnorm(2*n, theta), nrow = n)toy_sum <- ma2_summodel <- newModel(fnSimVec = toy_sim, fnSum = toy_sum, sumArgs = list(epsilon = 2), theta0 = 0)result1 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1),               method = "BSL", plotOnTheFly = TRUE)result2 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1),               method = "uBSL", plotOnTheFly = TRUE)result3 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1),               method = "semiBSL", plotOnTheFly = TRUE)combinePlotsBSL(list(result1, result2, result3), label = c("BSL","uBSL","semiBSL"), thin = 20)## End(Not run)

Convert a correlation matrix to a covariance matrix

Description

This function converts a correlation matrix to a covariance matrix

Usage

cor2cov(corr, std)

Arguments

corr

The correlation matrix to be converted. This must be symmetric.

std

A vector that contains the standard deviations of the variables in the correlation matrix.

Value

The covariance matrix.

Estimate the synthetic likelihood

Description

This function computes the estimated synthetic (log) likelihoodusing one of the four methods (“BSL”, “uBSL”, “semiBSL” and“BSLmisspec”). Please find the links below in the see also section formore details.

Usage

estimateLoglike(  ssy,  ssx,  method = c("BSL", "uBSL", "semiBSL", "BSLmisspec"),  log = TRUE,  verbose = FALSE,  ...)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

method

log

A logical argument indicating if the log of likelihood isgiven as the result. The default isTRUE.

verbose

A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault isFALSE.

...

Arguments to be passed to methods.

shrinkage Available for methods “BSL” and “semiBSL”. Astring argument indicating which shrinkage method to be used. The defaultisNULL, which means no shrinkage is used. Shrinkage estimation isonly available for methods “BSL” and “semiBSL”. Current options are“glasso” for the graphical lasso method ofFriedman et al. (2008) and “Warton” for the ridgeregularisation method of Warton (2008).
penalty Available for methods “BSL” and “semiBSL”. Thepenalty value to be used for the specified shrinkage method. Must bebetween zero and one if the shrinkage method is “Warton”.
standardise Available for method “BSL”. A logical argumentthat determines whether to standardise the summary statistics beforeapplying the graphical lasso. This is only valid if method is “BSL”,shrinkage is “glasso” and penalty is notNULL. The diagonalelements will not be penalised if the shrinkage method is “glasso”. Thedefault isFALSE.
GRC Available for method “BSL”. A logical argumentindicating whether the Gaussian rank correlation matrix(Boudt et al. 2012) should be used to estimate the covariancematrix in “BSL” method. The default isFALSE, which uses thesample covariance by default.
whitening Available for method “BSL”. This argument determineswhether Whitening transformation should be used in “BSL” method withWarton's shrinkage. Whitening transformation helps decorrelate the summarystatistics, thus encourages sparsity of the synthetic likelihood covariancematrix. This might allow heavier shrinkage to be applied without losingmuch accuracy, hence allowing the number of simulations to be reduced. Bydefault,NULL represents no Whitening transformation. Otherwise thisis enabled if a Whitening matrix is provided. SeeestimateWhiteningMatrix for the function to estimate theWhitening matrix.
ssyTilde Available for method “BSL”. The whitened observedsummary statisic. If this is notNULL, it will be used to savecomputation effort. Only used if Whitening is enabled.
kernel Available for method “semiBSL”. A string argumentindicating the smoothing kernel to pass intodensity for estimatingthe marginal distribution of each summary statistic. Only “gaussian" and“epanechnikov" are available. The default is “gaussian".
type Available for method “BSLmisspec”. A string argumentindicating which method is used to account for and detect potentialincompatibility. The two options are "mean" and "variance".
gamma Available for method “BSLmisspec”. The additionallatent parameter to account for possible incompatability between the modeland observed summary statistic. In “BSLmisspec” method, this is updatedwith a slice sampler (Neal 2003).

Value

The estimated synthetic (log) likelihood value.

References

Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.

Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.

Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Examples

data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,                  theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssxestimateLoglike(ssy, ssx, method = "BSL")estimateLoglike(ssy, ssx, method = "uBSL")estimateLoglike(ssy, ssx, method = "semiBSL")estimateLoglike(ssy, ssx, method = "BSLmisspec", type = "mean", gamma = rep(0.1, 50))

Estimate the Whitening matrix to be used in the “wBSL” method ofPriddle et al. (2021)

Description

This function estimates the Whitening matrix to be used in BSLwith Warton's shrinkage and Whitening (“wBSL” method ofPriddle et al. (2021)). The Whitening transformation anddecorrelation methods are detailed in Kessy et al. (2018).

Usage

estimateWhiteningMatrix(  n,  model,  method = c("PCA", "ZCA", "Cholesky", "ZCA-cor", "PCA-cor"),  thetaPoint = NULL,  parallel = FALSE,  parallelArgs = NULL)

Arguments

n

The number of model simulations to estimate the Whitening matrix.

model

A “MODEL” object generated with functionnewModel. SeenewModel.

method

The type of Whitening method to be used. The default is“PCA”.

thetaPoint

A point estimate of the parameter value with non-negligibleposterior support.

parallel

parallelArgs

A list of additional arguments to pass into theforeach function. Only used when parallel computing is enabled,default isNULL.

Value

The estimated Whitening matrix.

References

Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.

Priddle JW, Sisson SA, Frazier DT, Turner I, Drovandi C (2021).“Efficient Bayesian Synthetic Likelihood with Whitening Transformations.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1909.04857.

Examples

## Not run: data(ma2)model <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start)W <- estimateWhiteningMatrix(20000, model, method = "PCA", thetaPoint = c(0.6, 0.2))## End(Not run)

Gaussian rank correlation

Description

This function computes the Gaussian rank correlation ofBoudt et al. (2012).

Usage

gaussianRankCorr(x, vec = FALSE)

Arguments

x

A numeric matrix representing data where the number of rowsis the number of independent data points and the number of columns is thenumber of variables in the dataset.

vec

A logical argument indicating if the vector of correlationsshould be returned instead of a matrix.

Value

Gaussian rank correlation matrix (default) or a vector ofpair correlations.

References

Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.

Examples

data(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = list(TT = 10),                  theta0 = ma2$start, fnLogPrior = ma2_prior)set.seed(100)# generate 1000 simualtions from the ma2 modelx <- simulation(model, n = 1000, theta = c(0.6, 0.2))$xcorr1 <- cor(x) # traditional correlation matrixcorr2 <- gaussianRankCorr(x) # Gaussian rank correlation matrixoldpar <- par()par(mfrow = c(1, 2))image(corr1, main = 'traditional correlation matrix')image(corr2, main = 'Gaussian rank correlation matrix')par(mfrow = oldpar$mfrow)std <- apply(x, MARGIN = 2, FUN = sd) # standard deviationscor2cov(gaussianRankCorr(x), std) # convert to covariance matrix

Estimate the Gaussian synthetic (log) likelihood

Description

This function estimates the Gaussian synthetic log-likelihood(see Wood 2010 and Price et al. 2018). Several extensions areprovided in this function:shrinkage enables shrinkage estimation ofthe covariance matrix and is helpful to bring down the number of modelsimulations (see An et al. (2019) for an example of BSLwith glasso (Friedman et al. 2008) shrinkage estimation);GRC uses Gaussian rank correlation (Boudt et al. 2012) tofind a more robust correlation matrix;whitening(Kessy et al. 2018) could further reduce the number of modelsimulations upon Warton's shrinkage (Warton 2008) bydecorrelating the summary statistics.

Usage

gaussianSynLike(  ssy,  ssx,  shrinkage = NULL,  penalty = NULL,  standardise = FALSE,  GRC = FALSE,  whitening = NULL,  ssyTilde = NULL,  log = TRUE,  verbose = FALSE)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

shrinkage

penalty

The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”.

standardise

GRC

whitening

ssyTilde

The whitened observed summary statisic. If this is notNULL, it will be used to save computation effort. Only used ifWhitening is enabled.

log

A logical argument indicating if the log of likelihood isgiven as the result. The default isTRUE.

verbose

A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault isFALSE.

Value

The estimated synthetic (log) likelihood value.

References

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.

Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Wood SN (2010).“Statistical Inference for Noisy Nonlinear Ecological Dynamic Systems.”Nature,466, 1102–1107.doi:10.1038/nature09319.

Examples

data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,              theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# the standard Gaussian synthetic likelihood (the likelihood estimator used in BSL)gaussianSynLike(ssy, ssx)# the Gaussian synthetic likelihood with glasso shrinkage estimation# (the likelihood estimator used in BSLasso)gaussianSynLike(ssy, ssx, shrinkage = 'glasso', penalty = 0.1)# the Gaussian synthetic likelihood with Warton's shrinkage estimationgaussianSynLike(ssy, ssx, shrinkage = 'Warton', penalty = 0.9)# the Gaussian synthetic likelihood with Warton's shrinkage estimation and Whitening transformationW <- estimateWhiteningMatrix(20000, m)gaussianSynLike(ssy, ssx, shrinkage = 'Warton', penalty = 0.9, whitening = W)

Estimate the Gaussian synthetic (log) likelihood with an unbiased estimator

Description

This function computes an unbiased, nonnegative estimate of anormal density function from simulations assumed to be drawn from it. SeePrice et al. (2018) andGhurye and Olkin (1969).

Usage

gaussianSynLikeGhuryeOlkin(ssy, ssx, log = TRUE, verbose = FALSE)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

log

A logical argument indicating if the log of likelihood isgiven as the result. The default isTRUE.

verbose

A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault isFALSE.

Value

The estimated synthetic (log) likelihood value.

References

Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.

Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.

Examples

data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,              theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# unbiased estimate of the Gaussian synthetic likelihood# (the likelihood estimator used in uBSL)gaussianSynLikeGhuryeOlkin(ssy, ssx)

Obtain the gamma samples (the latent parameters for BSLmisspec method) from a"BSL" object

Description

seeBSLclass

Usage

getGamma(object, ...)

Arguments

object

A “BSL” class object.

...

Other arguments.

Value

The matrix of gamma samples (the latent parameters for BSLmisspecmethod), after removing burn-in and thinning.

Obtain the log-likelihoods from a "BSL" object

Description

seeBSLclass

Usage

getLoglike(object, ...)

Arguments

object

A “BSL” class object.

...

Other arguments.

Value

The vector of log likelihood evaluations, after removing burn-in and thinning.

Obtain the selected penalty values from a "PENALTY" object

Description

seePENALTYclass

Usage

getPenalty(object, ...)

Arguments

object

A “PENALTY” class object.

...

Other arguments.

Value

The selecty penalty values.

Obtain the samples from a "BSL" object

Description

seeBSLclass

Usage

getTheta(object, ...)

Arguments

object

A “BSL” class object.

...

Other arguments.

Value

The matrix of samples, after removing burn-in and thinning.

An MA(2) model

Description

In this example we wish to estimate the parameters of a simpleMA(2) time series model. We provide the data and tuning parameters requiredto reproduce the results in An et al. (2019).The journal article An et al. (2022) provides a fulldescription of how to use this package for the toad example.

Usage

data(ma2)ma2_sim(theta, TT)ma2_sim_vec(n, theta, TT)ma2_sum(x, epsilon = 0, delta = 1)ma2_prior(theta)

Arguments

theta

A vector of proposed model parameters,\theta_1 and\theta_2.

TT

The number of observations.

n

The number of simulations to run with the vectorisedsimulation function.

x

Observed or simulated data in the format of a vector of lengthTT.

epsilon

The skewness parameter in the sinh-arcsinh transformation.

delta

The kurtosis parameter in the sinh-arcsinh transformation.

Details

This example is based on estimating the parameters of a basic MA(2)time series model of the form

y_t = z_t + \theta_1 z_{t-1} + \theta_2 z_{t-2},

wheret=1,\ldots,TT andz_t \sim N(0,1)fort=-1,0,\ldots,TT. A uniformprior is used for this example, subject to the restrictions that-2<\theta_1<2,\theta_1+\theta_2>-1and\theta_1-\theta_2<1so that invertibility of the time series is satisfied. The summarystatistics are simply the full data.

Functions

ma2_sim: Simulates an MA(2) time series.
ma2_sim_vec: Simulates n MA(2) time series with a vectorised simulationfunction.
ma2_sum: Returns the summary statistics for a given data set. Theskewness and kurtosis of the summary statistics can be controlled via the\epsilon and\delta parameters. This is thesinh-arcsinnh transformation of Jones and Pewsey (2009). By default,the summary statistics function simply returns the raw data. Otherwise, thetransformation is introduced to motivate the “semiBSL” method.
ma2_prior: Evaluates the (unnormalised) log prior, which is uniformsubject to several restrictions related to invertibility of the timeseries.

A simulated dataset

An example “observed” dataset and the tuning parameters relevant to thatexample can be obtained usingdata(ma2). This “observed” data is asimulated dataset with\theta_1 = 0.6,\theta_2=0.2 andTT=50. Further information about this model and the specific choicesof tuning parameters used in BSL and BSLasso can be found in An et al.(2019).

data: A time series dataset, in the form of a vector of lengthTT
sim_args: A list containingTT=50
start: A vector of suitable initial values of the parametersfor MCMC
cov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 2\times 2 matrix

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

An Z, South LF, Drovandi CC (2022).“BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood.”Journal of Statistical Software,101(11), 1–33.doi:10.18637/jss.v101.i11.

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Jones MC, Pewsey A (2009).“Sinh-arcsinh distributions.”Biometrika,96(4), 761–780.ISSN 0006-3444.

Examples

## Not run: # Load the data for this example and set up the model objectdata(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args,                  theta0 = ma2$start, fnLogPrior = ma2_prior)thetaExact <- c(0.6, 0.2)# reduce the number of iterations M if desired for all methods below# Method 1: standard BSLresultMa2BSL <- bsl(y = ma2$data, n = 500, M = 300000, model = model, covRandWalk = ma2$cov,                    method = "BSL", verbose = 1L)show(resultMa2BSL)summary(resultMa2BSL)plot(resultMa2BSL, thetaTrue = thetaExact, thin = 20)# Method 2: unbiased BSLresultMa2uBSL <- bsl(y = ma2$data, n = 500, M = 300000, model = model, covRandWalk=ma2$cov,                     method = "uBSL", verbose = 1L)show(resultMa2uBSL)summary(resultMa2uBSL)plot(resultMa2uBSL, thetaTrue = thetaExact, thin = 20)# Method 3: BSLasso (BSL with glasso shrinkage estimation)# tune the penalty parameter fisrtssy <- ma2_sum(ma2$data)lambdaAll <- list(exp(seq(-5.5,-1.5,length.out=20)))set.seed(100)penaltyGlasso <- selectPenalty(ssy = ssy, n = 300, lambdaAll, theta = thetaExact,                        M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso")penaltyGlassoplot(penaltyGlasso)resultMa2BSLasso <- bsl(y = ma2$data, n = 300, M = 250000, model = model, covRandWalk=ma2$cov,                        method = "BSL", shrinkage = "glasso", penalty = 0.027, verbose = 1L)show(resultMa2BSLasso)summary(resultMa2BSLasso)plot(resultMa2BSLasso, thetaTrue = thetaExact, thin = 20)# Method 4: BSL with Warton's shrinkage and Whitening# estimate the Whtieing matrix and tune the penalty parameter firstW <- estimateWhiteningMatrix(20000, model, method = "PCA", thetaPoint = ma2$start)gammaAll <- list(seq(0.3, 0.8, 0.02))set.seed(100)penaltyWarton <- selectPenalty(ssy = ssy, n = 300, gammaAll, theta = thetaExact,                        M = 100, sigma = 1.2, model = model, method = "BSL", shrinkage = "Warton",                        whitening = W)penaltyWartonplot(penaltyWarton, logscale = FALSE)resultMa2Whitening <- bsl(y = ma2$data, n = 300, M = 250000, model = model, covRandWalk=ma2$cov,                        method = "BSL", shrinkage = "Warton", whitening = W,                        penalty = 0.52, verbose = 1L)show(resultMa2Whitening)summary(resultMa2Whitening)plot(resultMa2Whitening, thetaTrue = thetaExact, thin = 20)# Method 5: semiBSL, the summary statistics function is different from previous methodsmodel2 <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args,                  sumArgs = list(epsilon = 2), theta0 = ma2$start, fnLogPrior = ma2_prior)sim <- simulation(model, n = 1e4, theta = ma2$start, seed = 1) # run a short simulationplot(density(sim$ssx[, 1])) # the first marginal summary statistic is right-skewedresultMa2SemiBSL <- bsl(y = ma2$data, n = 500, M = 200000, model = model2, covRandWalk=ma2$cov,                        method = "semiBSL", verbose = 1L)show(resultMa2SemiBSL)summary(resultMa2SemiBSL)plot(resultMa2SemiBSL, thetaTrue = thetaExact, thin = 20)# Method 6: BSL with consideration of model misspecification (mean adjustment)resultMa2Mean <- bsl(y = ma2$data, n = 500, M = 200000, model = model, covRandWalk=ma2$cov,                        method = "BSLmisspec", misspecType = "mean", verbose = 1L)show(resultMa2Mean)summary(resultMa2Mean)plot(resultMa2Mean, thetaTrue = thetaExact, thin = 20)# Method 7: BSL with consideration of model misspecification (variance inflation)resultMa2Variance <- bsl(y = ma2$data, n = 500, M = 200000, model = model, covRandWalk=ma2$cov,                     method = "BSLmisspec", misspecType = "variance", verbose = 1L)show(resultMa2Variance)summary(resultMa2Variance)plot(resultMa2Variance, thetaTrue = thetaExact, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(5, 4, 1, 2), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultMa2BSL, resultMa2uBSL, resultMa2BSLasso, resultMa2SemiBSL), which = 1,                thetaTrue = thetaExact, thin = 20, label = c("bsl", "uBSL", "bslasso", "semiBSL"),                col = c("black", "red", "blue", "green"), lty = 1:4, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, cex = 1.5)# plot using the ggplot2 packagecombinePlotsBSL(list(resultMa2BSL, resultMa2uBSL, resultMa2BSLasso, resultMa2SemiBSL), which = 2,    thetaTrue = thetaExact, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"),    options.color = list(values=c("black", "red", "blue", "green")),    options.linetype = list(values = 1:4), options.size = list(values = rep(1, 4)),    options.theme = list(plot.margin = grid::unit(rep(0.03,4), "npc"),        axis.title = ggplot2::element_text(size=12), axis.text = ggplot2::element_text(size = 8),        legend.text = ggplot2::element_text(size = 12)))par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)

The multivariate G&K example

Description

Here we provide the data and tuning parameters required to reproducethe results from the multivariate G & K (Drovandi and Pettitt 2011) example from An et al. (2019).

Usage

data(mgnk)mgnk_sim(theta_tilde, TT, J, bound)mgnk_sum(y)

Arguments

theta_tilde

A vector with 15 elements for the proposed model parameters.

TT

The number of observations in the data.

J

The number of variables in the data.

bound

A matrix of boundaries for the uniform prior.

y

ATT\timesJ matrix of data.

Details

It is not practical to give a reasonable explanation of this example through R documentationgiven the number of equations involved. We refer the reader to the BSLasso paper (An et al. 2019)at <doi:10.1080/10618600.2018.1537928> for information on the model and summary statistic used in this example.

An example dataset

We use the foreign currency exchange data available fromhttps://www.rba.gov.au/statistics/historical-data.htmlas in An et al. (2019).

data: A1651\times3 matrix of data.
sim_args: Values ofsim_args relevant to this example.
start: A vector of suitable initial values of the parameters for MCMC.
cov: The covariance matrix of a multivariate normal random walk proposal distribution used in the MCMC, in the form of a 15 by 15 matrix

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Drovandi CC, Pettitt AN (2011).“Likelihood-free Bayesian estimation of multivariate quantile distributions.”Computational Statistics & Data Analysis,55(9), 2541–2556.ISSN 0167-9473, doi:10.1016/j.csda.2011.03.019.

Examples

## Not run: require(doParallel) # You can use a different package to set up the parallel backendrequire(MASS)require(elliplot)# Loading the data for this exampledata(mgnk)model <- newModel(fnSim = mgnk_sim, fnSum = mgnk_sum, simArgs = mgnk$sim_args, theta0 = mgnk$start,    thetaNames = expression(a[1],b[1],g[1],k[1],a[2],b[2],g[2],k[2],                            a[3],b[3],g[3],k[3],delta[12],delta[13],delta[23]))# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov,    method = "BSL", parallel = FALSE, verbose = 1L, plotOnTheFly = TRUE)stopCluster(cl)registerDoSEQ()show(resultMgnkBSL)summary(resultMgnkBSL)plot(resultMgnkBSL, which = 2, thin = 20)# Performing uBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkuBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov,    method = "uBSL", parallel = FALSE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkuBSL)summary(resultMgnkuBSL)plot(resultMgnkuBSL, which = 2, thin = 20)# Performing tuning for BSLassossy <- mgnk_sum(mgnk$data)lambda_all <- list(exp(seq(-2.5,0.5,length.out=20)), exp(seq(-2.5,0.5,length.out=20)),                   exp(seq(-4,-0.5,length.out=20)), exp(seq(-5,-2,length.out=20)))# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)set.seed(100)sp_mgnk <- selectPenalty(ssy, n = c(15, 20, 30, 50), lambda = lambda_all, theta = mgnk$start,    M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso", standardise = TRUE,    parallelSim = TRUE, parallelSimArgs = list(.packages = "MASS", .export = "ninenum"),    parallelMain = TRUE)stopCluster(cl)registerDoSEQ()sp_mgnkplot(sp_mgnk)# Performing BSLasso with a fixed penalty (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkBSLasso <- bsl(mgnk$data, n = 20, M = 80000, model = model, covRandWalk = mgnk$cov,    method = "BSL", shrinkage = "glasso", penalty = 0.3, standardise = TRUE, parallel = FALSE,    verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkBSLasso)summary(resultMgnkBSLasso)plot(resultMgnkBSLasso, which = 2, thin = 20)# Performing semiBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkSemiBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov,    method = "semiBSL", parallel = FALSE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkSemiBSL)summary(resultMgnkSemiBSL)plot(resultMgnkSemiBSL, which = 2, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(4, 4, 1, 1), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultMgnkBSL, resultMgnkuBSL, resultMgnkBSLasso, resultMgnkSemiBSL),                which = 1, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"),                col = c("red", "yellow", "blue", "green"), lty = 2:5, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, line = 0.75, cex = 1.2)# plot using the ggplot2 packagecombinePlotsBSL(list(resultMgnkBSL, resultMgnkuBSL, resultMgnkBSLasso, resultMgnkSemiBSL),    which = 2, thin = 20, label=c("bsl","ubsl","bslasso","semiBSL"),    options.color=list(values=c("red","yellow","blue","green")),    options.linetype = list(values = 2:5), options.size = list(values = rep(1, 4)),    options.theme = list(plot.margin = grid::unit(rep(0.03,4),"npc"),        axis.title = ggplot2::element_text(size=12), axis.text = ggplot2::element_text(size = 8),        legend.text = ggplot2::element_text(size = 12)))par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)

Progress Bar

Description

Print a customisable progress bar in the console.

Usage

myMiniProgressBar(p, txt1 = "", txt2 = "", style = 1, label = c("=", "-", "|"))

Arguments

p

Numeric, percentage of finished progress, between 0 and 1.

txt1

String to put before the progress bar

txt2

String to put after the progress bar

style

The display style. 1 is single-lined; 2 is double-lined; 3 display the progress in a 5-lined block.

label

Character labels for "finished", "un-finished", and "side bars".

Value

No return value, called for side effects.

Convert an observation matrix to a vector of n-day displacements

Description

Convert an observation matrix to a vector of n-daydisplacements. This is a function for the toad example.

Usage

obsMat2deltax(X, lag)

Arguments

X

The observation matrix to be converted.

lag

Interger, the number of day lags to compute the displacement.

Value

A vector of displacements.

Generate a random sample from the zero-centered stable distribution

Description

Draw a sample from a symmetric, zero-centered stable distribution withgiven scale and stability (alpha) parameters, using the CMS algorithm.

Usage

rstable(scale, alpha)

Arguments

scale

The scale parameter.

alpha

The stability parameter.

Value

A random sample from the zero-centered stable distribution.

Selecting the Penalty Parameter

Description

This is the main function for selecting the shrinkage (graphicallasso or Warton's estimator) penalty parameter for method BSL or semiBSLbased on a point estimate of the parameters. Parallel computing issupported with the R packageforeach. The penalty selection methodis outlined in An et al. (2019).

Usage

selectPenalty(  ssy,  n,  lambda,  M,  sigma = 1.5,  model,  theta = NULL,  method = c("BSL", "semiBSL"),  shrinkage = c("glasso", "Warton"),  parallelSim = FALSE,  parallelSimArgs = NULL,  parallelMain = FALSE,  verbose = 1L,  ...)

Arguments

ssy

A summary statistic vector for the observeddata.

n

A vector of possible values ofn, thenumber of simulations from the model per MCMC iteration for estimating thesynthetic likelihood.

lambda

A list, with each entry containing the vector ofpenalty values to test for the corresponding choice ofn.

M

The number of repeats to use in estimating thestandard deviation of the estimated log synthetic likelihood.

sigma

The standard deviation of the log syntheticlikelihood estimator to aim for, usually a value between 1 and 2. Thisparameter helps to control the mixing of a Markov chain.

model

A “MODEL” object generated with functionnewModel. SeenewModel.

theta

A point estimate of the parameter value whichall of the simulations will be based on. By default, iftheta isNULL, it will be replaced bytheta0 from the givenmodel.

method

A string argument indicating the method to beused. If the method is “BSL”, the shrinkage is applied to the Gaussiancovariance matrix. Otherwise if the method is “semiBSL”, the shrinkage isapplied to the correlation matrix of the Gaussian copula.

shrinkage

A string argument indicating which shrinkage method tobe used. Current options are “glasso” for the graphical lasso method ofFriedman et al. (2008) and “Warton” for the ridgeregularisation method of Warton (2008).

parallelSim

A logical value indicating whether parallelcomputing should be used for simulation and summary statistic evaluation.Default isFALSE.

parallelSimArgs

A list of additional arguments to pass into theforeach function. Only used whenparallelSim isTRUE,default isNULL.

parallelMain

A logical value indicating whether parallelcomputing should be used to computing the graphical lasso function. Noticethat this should only be turned on when there are a lot of candidate valuesinlambda. Default isFALSE.

verbose

...

Other arguments to pass togaussianSynLike (“BSL”method) orsemiparaKernelEstimate (“semiBSL” method).

Value

An S4 objectPENALTY of the penalty selection results. Theshow andplot methods are provided with the S4 class.

Author(s)

Ziwen An, Leah F. South and Christopher Drovandi

References

An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Examples

## Not run: data(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args,                  theta0 = ma2$start, fnLogPrior = ma2_prior)theta <- c(0.6,0.2)# Performing tuning for BSLasso (BSL with glasso shrinkage estimation)ssy <- ma2_sum(ma2$data)lambda_all <- list(exp(seq(-3,0.5,length.out=20)), exp(seq(-4,-0.5,length.out=20)),                   exp(seq(-5.5,-1.5,length.out=20)), exp(seq(-7,-2,length.out=20)))set.seed(100)sp_ma2 <- selectPenalty(ssy = ssy, n = c(50, 150, 300, 500), lambda_all, theta = theta,    M = 100, sigma = 1.5, model = model, method = 'BSL', shrinkage = 'glasso')sp_ma2plot(sp_ma2)## End(Not run)

Estimate the semi-parametric synthetic (log) likelihood

Description

This function computes the semi-parametric synthetic likelihoodestimator of (An et al. 2019). The advantage of thissemi-parametric estimator over the standard synthetic likelihood estimatoris that the semi-parametric one is more robust to non-normal summarystatistics. Kernel density estimation is used for modelling each univariatemarginal distribution, and the dependence structure between summaries arecaptured using a Gaussian copula. Shrinkage on the correlation matrixparameter of the Gaussian copula is helpful in decreasing the number ofsimulations.

Usage

semiparaKernelEstimate(  ssy,  ssx,  kernel = "gaussian",  shrinkage = NULL,  penalty = NULL,  log = TRUE)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

kernel

A string argument indicating the smoothing kernel to passintodensity for estimating the marginal distribution of eachsummary statistic. Only “gaussian" and “epanechnikov" are available. Thedefault is “gaussian".

shrinkage

A string argument indicating which shrinkage method tobe used. The default isNULL, which means no shrinkage is used.Current options are “glasso” for the graphical lasso method ofFriedman et al. (2008) and “Warton” for the ridgeregularisation method of Warton (2008).

penalty

The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”.

log

A logical argument indicating if the log of likelihood isgiven as the result. The default isTRUE.

Value

The estimated synthetic (log) likelihood value.

References

An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.

Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.

Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.

Examples

data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args,              theta0 = ma2$start, sumArgs = list(delta = 0.5))ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# check the distribution of the first summary statistic: highly non-normalplot(density(ssx[, 1]))# the standard synthetic likelihood estimator over-estimates the likelihood heregaussianSynLike(ssy, ssx)# the semi-parametric synthetic likelihood estimator is more robust to non-normalitysemiparaKernelEstimate(ssy, ssx)# using shrinkage on the correlation matrix of the Gaussian copula is also possiblesemiparaKernelEstimate(ssy, ssx, shrinkage = "Warton", penalty = 0.8)

The simulation function for the toad example

Description

The simulation function for the toad example.

Usage

sim_toad(params, ntoad, nday, model = 1L, d0 = 100)

Arguments

params

A vector of proposed model parameters,\alpha,gamma andp_0.

ntoad

The number of toads to simulate in the observation.

nday

The number of days lasted of the observation.

model

Which model to be used. 1 for the random return model, 2 for the nearest return model,and 3 for the distance-based return probability model.

d0

Characteristic distance for model 3. Only used if model is 3.

Value

A data matrix.

Examples

sim_toad(c(1.7,36,0.6), 10, 8, 1)

Simulation function of the cell biology example

Description

Simulation function of the cell biology example.

Usage

simulate_cell(x, rows, cols, Pm, Pp, sim_iters, num_obs)

Arguments

x

The initial matrix of cell presences of sizerows\timescols.

rows

The number of rows in the lattice (rows in the cell locationmatrix).

cols

The number of columns in the lattice (columns in the celllocation matrix).

Pm

ParameterP_m,the probability of cell movement.

Pp

ParameterP_p,the probability of cell proliferation.

sim_iters

num_obs

The total number of images taken after initialisation.

Value

Arows\timescols\timesnum_obs arrayof the cell presences at times1:num_obs (not time 0).

Run simulations with a give "MODEL" object

Description

seeMODEL

Usage

simulation(model, ...)

Arguments

model

A “MODEL” object.

...

Other arguments.

Generate a random sample of gamma for the R-BSL-M method ofFrazier and Drovandi (2021) using slice sampling

Description

This function updates the gamma of the R-BSL-M method ofFrazier and Drovandi (2021) with a slice sampler(Neal 2003). Note this function is mainly designed forinternal usage.

Usage

sliceGammaMean(  ssy,  ssx,  loglike,  gamma = numeric(length(ssy)),  tau = 1,  w = 1,  std = NULL,  maxit = 1000)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

loglike

The current log synthetic likelihood. This is computed withfunctionsynLikeMisspec with the current gamma value.

gamma

The additional latent parameter to account for possibleincompatability between the model and observed summary statistic. In“BSLmisspec” method, this is updated with a slice sampler(Neal 2003). The default gamma implies no model misspecificationand is equivalent to the standardgaussianSynLike estimator.

tau

Scale (or inverse rate) parameter of the Laplace priordistribution for gamma.

w

Step size for the stepping out in the slice sampler. The defaultstep size is 1.

std

Standard deviation of the columns of ssx. If this is notNULL, it will be used to save computation effort.

maxit

The maximum number of iteration of the stepping out and shrinksteps of slice sampler. The default is 1e3.

References

Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.

Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.

Generate a random sample of gamma for the R-BSL-V method ofFrazier and Drovandi (2021) using slice sampling

Description

This function updates the gamma parameter with a slice sampler(Neal 2003). The target distribution is the varianceinflated approximate posterior of BSL with model misspecification. SeeFrazier and Drovandi (2021). Note this function is mainlydesigned for internal usage.

Usage

sliceGammaVariance(  ssy,  ssx,  loglike,  gamma = numeric(length(ssy)),  tau = 1,  w = 1,  std = NULL,  maxit = 1000)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

loglike

The current log synthetic likelihood. This is computed withfunctionsynLikeMisspec with the current gamma value.

gamma

tau

Numeric. Scale (or inverse rate) parameter of the exponentialprior distribution.

w

Step size for the stepping out in the slice sampler. The defaultstep size is 1.

std

Standard deviation of the columns of ssx. If this is notNULL, it will be used to save computation effort.

maxit

The maximum number of iteration of the stepping out and shrinksteps of slice sampler. The default is 1e3.

References

Compute the summary statistics with the given data

Description

seeMODEL

Usage

summStat(x, model)

Arguments

x

The data to pass to the summary statistics function.

model

A “MODEL” object.

Estimate the Gaussian synthetic (log) likelihood whilst acknowledging modelincompatibility

Description

This function estimates the Gaussian synthetic likelihood whilstacknowledging that there may be incompatibility between the model and theobserved summary statistic. The method has two different ways toaccount for and detect incompatibility (mean adjustment and varianceinflation). An additional free parametergamma is employed to account for themodel misspecification. See the R-BSL methods ofFrazier and Drovandi (2021) for more details. Note this functionis mainly designed for interal use as the latent variablegamma needto be chosen otherwise. Alternatively,gamma is updated with a slicesampler (Neal 2003), which is the method ofFrazier and Drovandi (2021).

Usage

synLikeMisspec(  ssy,  ssx,  type = c("mean", "variance"),  gamma = numeric(length(ssy)),  log = TRUE,  verbose = FALSE)

Arguments

ssy

The observed summary statisic.

ssx

A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration.

type

A string argument indicating which method is used to account forand detect potential incompatibility. The two options are "mean" and"variance" for mean adjustment and variance inflation, respectively.

gamma

log

A logical argument indicating if the log of likelihood isgiven as the result. The default isTRUE.

verbose

A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault isFALSE.

Value

The estimated synthetic (log) likelihood value.

References

Examples

# a toy model (for details see section 4.1 from Frazier et al 2019)# the true underlying model is a normal distribution with standard deviation equals to 0.2# whist the data generation process has the standard deviation fixed to 1set.seed(1)y <- rnorm(50, 1, sd = 0.2)ssy <- c(mean(y), var(y))m <- newModel(fnSim = function(theta) rnorm(50, theta), fnSum = function(x) c(mean(x), var(x)),              theta0 = 1, fnLogPrior = function(x) log(dnorm(x, sd = sqrt(10))))ssx <- simulation(m, n = 300, theta = 1, seed = 10)$ssx# gamma is updated with a slice samplergamma <- rep(0.1, length(ssy))synLikeMisspec(ssy, ssx, type = "variance", gamma = gamma)

Toad example

Description

This example estimates the parameter for the toad example. Themodel simulates the movement of an amphibian called Fowler's toad. Themodel is proposed by Marchand et al. (2017). This exampleincludes both simulated and real data. The real data is obtained from the supplementary material of Marchand et al. (2017).The journal article An et al. (2022) provides a fulldescription of how to use this package for the toad example.

Usage

data(toad)toad_sim(  theta,  ntoads,  ndays,  model = 1,  d0 = 100,  na = matrix(FALSE, ndays, ntoads))toad_sum(X, lag = c(1, 2, 4, 8), p = seq(0, 1, 0.1))toad_prior(theta)

Arguments

theta

A vector of proposed model parameters,\alpha,\gamma andp_0.

ntoads

The number of toads to simulate in the observation.

ndays

The number of days observed.

model

Which model to be used: 1 for the random return model, 2 for thenearest return model, and 3 for the distance-based return probabilitymodel. The default is 1.

d0

Characteristic distance for model 3. Only used ifmodel is3.

na

Logical. This is the index matrix for missing observations. Bydefault,matrix(FALSE, ndays, ntoads) indicates there is nomissingness in the observation matrix.

X

The data matrix.

lag

The lag of days to compute the summary statistics, default as 1,2, 4 and 8.

p

The numeric vector of probabilities to compute the quantiles.

Details

The example includes the three different returning models ofMarchand et al. (2017). Please seeMarchand et al. (2017) for a full description of the toadmodel, and also An et al. (2019) for Bayesian inferencewith the semi-BSL method.

Functions

toad_sim: Simulates data from the model, using C++ in the backend.
toad_sum: Computes the summary statistics for this example. The summary statistics are the log differences between adjacent quantiles and also the median.
toad_prior: Evaluates the log prior at the chosen parameters.

datasets (simulated and real)

A simulated dataset and a real dataset are provided in this example. Bothdatasets contain observations from 66 toads for 63 days. The simulateddataset is simulated with parameter\theta = (1.7, 35, 0.6). This is the data used in An et al. (2019). The realdataset is obtained from the supplementary data ofMarchand et al. (2017).

data_simulated: A 63\times 66 matrix of the observedtoad locations (simulated data).
data_real: A 63\times 66 matrix of the observedtoad locations (real data).
cov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 3\times 3 matrix.
theta0: A vector of suitable initial values of the parametersfor MCMC.
sim_args_simulated andsim_args_real: A list of thearguments to pass into the simulation function.
- ndays: The number of days observed.
- ntoads: The total number of toads being observed.
- model: Indicator of which model to be used.
- na: Indicator matrix for missingness.

Author(s)

Ziwen An, Leah F. South andChristopher Drovandi

References

Examples

## Not run: require(doParallel) # You can use a different package to set up the parallel backenddata(toad)## run standard BSL for the simulated datasetmodel1 <- newModel(fnSim = toad_sim, fnSum = toad_sum, theta0 = toad$theta0,                   fnLogPrior = toad_prior, simArgs = toad$sim_args_simulated,                    thetaNames = expression(alpha,gamma,p[0]))paraBound <- matrix(c(1,2,0,100,0,0.9), 3, 2, byrow = TRUE)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultToadSimulated <- bsl(toad$data_simulated, n = 1000, M = 10000, model = model1,                           covRandWalk = toad$cov, logitTransformBound = paraBound,                           parallel = TRUE, verbose = 1L, plotOnTheFly = 100)stopCluster(cl)registerDoSEQ()show(resultToadSimulated)summary(resultToadSimulated)plot(resultToadSimulated, thetaTrue = toad$theta0, thin = 20)## run standard BSL for the real datasetmodel2 <- newModel(fnSim = toad_sim, fnSum = toad_sum, theta0 = toad$theta0,                   fnLogPrior = toad_prior, simArgs = toad$sim_args_real,                   thetaNames = expression(alpha,gamma,p[0]))paraBound <- matrix(c(1,2,0,100,0,0.9), 3, 2, byrow = TRUE)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultToadReal <- bsl(toad$data_real, n = 1000, M = 10000, model = model2,                      covRandWalk = toad$cov, logitTransformBound = paraBound,                      parallel = TRUE, verbose = 1L, plotOnTheFly = 100)stopCluster(cl)registerDoSEQ()show(resultToadReal)summary(resultToadReal)plot(resultToadReal, thetaTrue = toad$theta0, thin = 20)## End(Not run)

Movatterモバイル変換

Bayesian synthetic likelihood

Description

Author(s)

References

S4 class “BSL”.

Description

Usage

Arguments

Slots

Examples

S4 class “MODEL”

Description

Usage

Arguments

Value

Slots

Examples

S4 class “PENALTY”

Description

Usage

Arguments

Slots

See Also

Performing BSL, uBSL, semiBSL and BSLmisspec

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Cell biology example

Description

Usage

Arguments

Details

Functions

A simulated dataset

Author(s)

References

Examples

Plot the densities of multiple “bsl” class objects.

Description

Usage

Arguments

Value

See Also

Examples

Convert a correlation matrix to a covariance matrix

Description

Usage

Arguments

Value

Estimate the synthetic likelihood

Description

Usage

Arguments

Value

References

See Also

Examples

Estimate the Whitening matrix to be used in the “wBSL” method ofPriddle et al. (2021)

Description

Usage

Arguments

Value

References

Examples

Gaussian rank correlation

Description

Usage

Arguments

Value

References

See Also

Examples

Estimate the Gaussian synthetic (log) likelihood

Description