| Type: | Package |
| Title: | Bayesian Synthetic Likelihood |
| Version: | 3.2.5 |
| Date: | 2022-11-02 |
| Description: | Bayesian synthetic likelihood (BSL, Price et al. (2018) <doi:10.1080/10618600.2017.1302882>) is an alternative to standard, non-parametric approximate Bayesian computation (ABC). BSL assumes a multivariate normal distribution for the summary statistic likelihood and it is suitable when the distribution of the model summary statistics is sufficiently regular. This package provides a Metropolis Hastings Markov chain Monte Carlo implementation of four methods (BSL, uBSL, semiBSL and BSLmisspec) and two shrinkage estimators (graphical lasso and Warton's estimator).uBSL (Price et al. (2018) <doi:10.1080/10618600.2017.1302882>) uses an unbiased estimator to the normal density. A semi-parametric version of BSL (semiBSL, An et al. (2018) <doi:10.48550/arXiv.1809.05800>) is more robust to non-normal summary statistics. BSLmisspec (Frazier et al. 2019 <doi:10.48550/arXiv.1904.04551>) estimates the Gaussian synthetic likelihood whilst acknowledging that there may be incompatibility between the model and the observed summary statistic. Shrinkage estimation can help to decrease thenumber of model simulations when the dimension of the summary statistic is high (e.g., BSLasso, An et al. (2019) <doi:10.1080/10618600.2018.1537928>). Extensions to this package are planned. For a journal article describing howto use this package, see An et al. (2022) <doi:10.18637/jss.v101.i11>. |
| Depends: | R (≥ 3.3.0) |
| License: | GPL-2 |GPL-3 [expanded from: GPL (≥ 2)] |
| LazyLoad: | yes |
| Imports: | glasso, ggplot2, MASS, mvtnorm, copula, whitening, graphics,gridExtra, foreach, coda, Rcpp, doRNG, methods, stringr, Rdpack(≥ 0.7) |
| Suggests: | elliplot, doParallel, rbenchmark, mixtools |
| LinkingTo: | RcppArmadillo, Rcpp |
| LazyData: | true |
| RoxygenNote: | 7.1.1 |
| Encoding: | UTF-8 |
| Collate: | 'BSL-package.R' 'RcppExports.R' 's4-MODEL.R' 's4-BSL.R''bsl.R' 'cell.R' 'combinePlotsBSL.R' 'covWarton.R''estimateLoglike.R' 'estimateWhiteningMatrix.R''gaussianRankCorr.R' 'gaussianSynLike.R''gaussianSynLikeGhuryeOlkin.R' 'imports.R' 'kernelCDF.R''logitTransform.R' 'ma2.R' 'mgnk.R' 'myMiniProgressBar.R''s4-PENALTY.R' 'selectPenalty.R' 'semiparaKernelEstimate.R''sliceGammaMean.R' 'sliceGammaVariance.R' 'synLikeMisspec.R''toad.R' |
| RdMacros: | Rdpack |
| NeedsCompilation: | yes |
| Packaged: | 2022-11-02 05:28:19 UTC; southl |
| Author: | Ziwen An |
| Maintainer: | Leah F. South <l1.south@qut.edu.au> |
| Repository: | CRAN |
| Date/Publication: | 2022-11-03 09:00:07 UTC |
Bayesian synthetic likelihood
Description
Bayesian synthetic likelihood (BSL,Price et al. (2018)) is an alternative to standard,non-parametric approximate Bayesian computation (ABC). BSL assumes amultivariate normal distribution for the summary statistic likelihood and itis suitable when the distribution of the model summary statistics issufficiently regular.
In this package, a Metropolis Hastings Markov chain Monte Carlo (MH-MCMC)implementation of BSL is available. We also include implementations of fourmethods (BSL, uBSL, semiBSL and BSLmisspec) and two shrinkage estimators(graphical lasso and Warton's estimator).
Methods: (1) BSL (Price et al. 2018), which is the standard form ofBayesian synthetic likelihood, assumes the summary statistic is roughlymultivariate normal; (2) uBSL (Price et al. 2018), which uses anunbiased estimator to the normal density; (3) semiBSL(An et al. 2019), which relaxes the normality assumption to anextent and maintains the computational advantages of BSL without any tuning;and (4) BSLmisspec (Frazier and Drovandi 2021), which estimates theGaussian synthetic likelihood whilst acknowledging that there may beincompatibility between the model and the observed summary statistic.
Shrinkage estimators are designed particularly to reduce the number ofsimulations if method is BSL or semiBSL: (1) graphical lasso(Friedman et al. 2008) finds a sparse precision matrix with anL1-regularised log-likelihood. An et al. (2019) usegraphical lasso within BSL to bring down the number of simulationssignificantly when the dimension of the summary statistic is high; and (2)Warton's estimator (Warton 2008) penalises the correlationmatrix and is straightforward to compute. When using the Warton's shrinkageestimator, it is also possible to utilise the Whitening transformation(Kessy et al. 2018) to help decorrelate the summary statsitics, thusencouraging sparsity of the synthetic likelihood covariance matrix.
Parallel computing is supported through theforeach package and userscan specify their own parallel backend by using packages likedoParallel ordoMC. Then model simulations required toestimate the synthetic likelihood at each iteration of MCMC will bedistributed across multiple cores. Alternatively a vectorised simulationfunction that simultaneously generatesn model simulations is alsosupported.
The main functionality is available through:
bsl: The general function to perform BSL,uBSL, or semiBSL (with or without parallel computing).selectPenalty: A function to select the penalty when usingshrinkage estimation within BSL or semiBSL.
Several examples have also been included. These examples can be used toreproduce the results of An et al. (2019), and can help practitioners learnhow to use the package.
ma2: The MA(2) example fromAn et al. (2019).mgnk: The multivariate G&K example fromAn et al. (2019).cell: The cell biology example fromPrice et al. (2018) and An et al. (2019).toad: The toad example fromMarchand et al. (2017), and also considered inAn et al. (2019).
Extensions to this package are planned. For a journal article describing howto use this package, including full descriptions on the MA(2) and toad examples,see An et al. (2022).
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).
An Z, South LF, Drovandi CC (2022).“BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood.”Journal of Statistical Software,101(11), 1–33.doi:10.18637/jss.v101.i11.
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.
Marchand P, Boenke M, Green DM (2017).“A stochastic movement model reproduces patterns of site fidelity and long-distance dispersal in a population of Fowlers toads (Anaxyrus fowleri).”Ecological Modelling,360, 63–69.ISSN 0304-3800, doi:10.1016/j.ecolmodel.2017.06.025.
Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
S4 class “BSL”.
Description
The S4 class “BSL” is produced by running functionbsl and contains the result of a BSL run. Basic S4 methodsshow,summary andplot are provided.theta andloglike returns the MCMC samples of parameter values and estimatedlog-likelihoods.
Usage
## S4 method for signature 'BSL'show(object)## S4 method for signature 'BSL'summary(object, burnin = 0, thetaNames = NULL)## S4 method for signature 'BSL,ANY'plot( x, which = 1L, thin = 1, burnin = 0, thetaTrue = NULL, options.plot = NULL, top = "Approximate Univariate Posteriors", options.density = list(), options.theme = list())## S4 method for signature 'BSL'getTheta(object, burnin = 0, thin = 1)## S4 method for signature 'BSL'getLoglike(object, burnin = 0, thin = 1)## S4 method for signature 'BSL'getGamma(object, burnin = 0, thin = 1)Arguments
object | A “BSL” class object. |
burnin | the number of MCMC burn-in steps to be taken. |
thetaNames | Parameter names to be shown in the summary table. If notgiven, parameter names of the “BSL” object will be used by default. |
x | A “BSL” class object to plot. |
which | An integer argument indicating which plot function to beused. The default, |
thin | A numeric argument indicating the gap between samples tobe taken when thinning the MCMC draws. The default is |
thetaTrue | A set of true parameter values to be included on the plotsas a reference line. The default is |
options.plot | A list of additional arguments to pass into the |
top | A character argument of the combined plot title if |
options.density | A list of additional arguments to pass into the |
options.theme | A list of additional arguments to pass into the |
Slots
thetaObject of class “matrix”. MCMC samples from the jointapproximate posterior distribution of the parameters.
loglikeObject of class “numeric”. Accepted MCMC samples of theestimated log-likelihood values.
callObject of class “call”. The original code that was used to callthe method.
modelObject of class “MODEL”.
acceptanceRateObject of class “numeric”. The acceptance rate of theMCMC algorithm.
earlyRejectionRateObject of class “numeric”. The early rejectionrate of the algorithm (early rejection may occur when using bounded priordistributions).
errorRateObject of class “numeric”. The error rate. If any infinitesummary statistic or infinite log-likelihood estimate occurs during theprocess, it is marked as an error and the proposed parameter will berejected.
yObject of class “ANY”. The observed data.
nObject of class “numeric”. The number of simulations from the modelper MCMC iteration to estimate the synthetic likelihood.
MObject of class “numeric”. The number of MCMC iterations.
covRandWalkObject of class “matrix”. The covariance matrix used inmultivariate normal random walk proposals.
methodObject of class “character”. The character argument indicatingthe used method.
shrinkageObject of class “characterOrNULL”. The character argumentindicating the shrinkage method.
penaltyObject of class “numericOrNULL”. The penalty value.
GRCObject of class “logical”. Whether the Gaussian rank correlationmatrix is used.
logitTransformObject of class “logical”. The logical argumentindicating whether a logit transformation is used in the algorithm.
logitTransformBoundObject of class “matrixOrNULL”. The matrix oflogitTransformBound.
standardiseObject of class “logical”. The logical argument thatdetermines whether to standardise the summary statistics.
parallelObject of class “logical”. The logical value indicatingwhether parallel computing is used in the process.
parallelArgsObject of class “listOrNULL”. The list of additionalarguments to pass into the
foreachfunction.timeObject of class “difftime”. The running time.
gammaObject of class “numeric”. MCMC samples of gamma parametervalues of the mean adjustment or variance inflation for method“BSLmisspec”.
misspecTypeObject of class “characterOrNULL”. The character argumentindicating whether mean adjustment ("mean") or variance inflation("variance") to be used in "BSLmisspec" method.
tauObject of class “numeric”. Parameter of the prior distributionfor "BSLmisspec" method. For mean adjustment,
tauis the scale ofthe Laplace distribution. For variance inflation,tauis the mean ofthe exponential distribution.whiteningObject of class “logicalOrMatrixOrNULL”. A logical argumentdetermines whether Whitening transformation is used in “BSL” method withWarton's shrinkage, or just the Whitening matrix used.
Examples
## Not run: # a toy exampletoy_simVec <- function(n, theta) matrix(rnorm(n, theta), nrow = n) # the simulation functiontoy_sum <- function(x) x # the summary statistic functionmodel <- newModel(fnSimVec = toy_simVec, fnSum = toy_sum, theta0 = 0) # create the model objectresult_toy <- bsl(y = 1, n = 100, M = 1e4, model = model, covRandWalk = matrix(1))summary(result_toy)plot(result_toy)## End(Not run)S4 class “MODEL”
Description
The S4 class contains the simulation and summary statisticsfunction and other necessary arguments for a model to run in the mainbsl function.
newModel is the constructor function for aMODELobject.
simulation runs a number of simulations and computes thecorreponding summary statistics with the provided model.
summStat computes the summary statistics with the given data and model object.The summary statistics function and relevant arguments are obtained from the model.
Usage
newModel( fnSim, fnSimVec, fnSum, fnLogPrior, simArgs, sumArgs, theta0, thetaNames, test = TRUE, verbose = TRUE)## S4 method for signature 'MODEL'simulation( model, n = 1, theta = model@theta0, summStat = TRUE, parallel = FALSE, parallelArgs = NULL, seed = NULL)## S4 method for signature 'ANY,MODEL'summStat(x, model)Arguments
fnSim | A function that simulates data for a given parametervalue. The first argument should be the parameters. Other necessaryarguments (optional) can be specified with |
fnSimVec | A vectorised function that simulates a number ofdatasets simultaneously for a given parameter value. The first twoarguments should be the number of simulations to run and parameters,respectively. Other necessary arguments (optional) can be specified with |
fnSum | A function for computing summary statistics of data. Thefirst argument should be the observed or simulated dataset. Other necessaryarguments (optional) can be specified with |
fnLogPrior | A function that computes the log of prior density for aparameter. If this is missing, the prior by default is an improper flatprior over the real line for each parameter. The function must have asingle input: a vector of parameter values. |
simArgs | A list of additional arguments to pass into the simulationfunction. Only use when the input |
sumArgs | A list of additional arguments to pass into the summarystatistics function. Only use when the input |
theta0 | Initial guess of the parameter value. |
thetaNames | A string vector of parameter names, which must have thesame length as the parameter vector. |
test | Logical, whether a short simulation test will be ranupon initialisation. |
verbose | Logical, whether to print verbose messages wheninitialising a “MODEL” object. |
model | A “MODEL” class object. |
n | The number of simulations to run. |
theta | The parameter value. |
summStat | Logical indicator whether the correpsonding summary statisticsshould be returned or not. The default is |
parallel | A logical value indicating whether parallel computing shouldbe used for simulation and summary statistic evaluation. The default is |
parallelArgs | A list of additional arguments to pass into the |
seed | A seed number to pass to the |
x | The data to pass to the summary statistics function. |
Value
A list of simulation results using the given parameter.xcontains the raw simulated datasets.ssx contains the summarystatistics.
A vector of the summary statistics.
Slots
fnSimA function that simulates data for a given parameter value. Thefirst argument should be the parameters. Other necessary arguments(optional) can be specified with
simArgs.fnSimVecA vectorised function that simulates a number of datasetssimultaneously for a given parameter value. If this is not
NULL,vectorised simulation function will be used instead offnSim. Thefirst two arguments should be the number of simulations to run andparameters, respectively. Other necessary arguments (optional) can bespecified withsimArgs. The output must be a list of each simulationresult.fnSumA function for computing summary statistics of data. The firstargument should be the observed or simulated dataset. Other necessaryarguments (optional) can be specified with
sumArgs. The users shouldcode this function carefully so the output have fixed length and nevercontain anyInfvalue.fnLogPriorA function that computes the log of prior density for aparameter. The default is
NULL, which uses an improper flat priorover the real line for each parameter. The function must have a singleinput: a vector of parameter values.simArgsA list of additional arguments to pass into the simulationfunction. Only use when the input
fnSimorfnSimVecrequiresadditional arguments. The default isNULL.sumArgsA list of additional arguments to pass into the summarystatistics function. Only use when the input
fnSumrequiresadditional arguments. The default isNULL.theta0Initial guess of the parameter value, which is used as thestarting value for MCMC.
thetaNamesExpression, parameter names.
nsThe number of summary statistics of a single observation. Note thiswill be generated automatically, thus is not required for initialisation.
testLogical, whether a short simulation test will be ran uponinitialisation.
verboseLogical, whether to print verbose messages when initialising a“MODEL” object.
Examples
# set up the model for the ma2 exampledata(ma2)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, fnLogPrior = ma2_prior, verbose = FALSE)validObject(m)# benchmark the serial and vectorised simulation function (require the rbenchmark package)m1 <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, fnLogPrior = ma2_prior)m2 <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, fnLogPrior = ma2_prior)require("rbenchmark")## Not run: benchmark(serial = simulation(m1, n = 1000, theta = c(0.6, 0.2)), vectorised = simulation(m2, n = 1000, theta = c(0.6, 0.2)))## End(Not run)S4 class “PENALTY”
Description
This S4 class contains the penalty selection result fromfunctionselectPenalty.show display the penaltyselection result.plot plot the penalty selection result usingggplot2.
Usage
## S4 method for signature 'PENALTY'show(object)## S4 method for signature 'PENALTY,ANY'plot(x, logscale = TRUE)## S4 method for signature 'BSL'getPenalty(object)Arguments
object | The S4 object of class “PENALTY” to show. |
x | The S4 object of class “PENALTY” to plot. |
logscale | A logical argument whether the x-axis (penalty) should be log transformed. Thedefault is |
Slots
loglikeA list of the log-likelihood values. The list contains multiplematrices (each corresponds to the result for a specific n value). Thenumber of row of the matrix equals to the number of repeats
M. Thecolumns of the matrix stands for different penalty values.nA vector of
n, the number of simulations from the model perMCMC iteration for estimating the synthetic likelihood.lambdaA list, with each entry containing the vector of penalty valuesfor the corresponding choice of
n.MThe number of repeats used in estimating the standard deviation ofthe estimated log synthetic likelihood.
sigmaThe standard deviation of the log synthetic likelihood estimatorto aim for, usually a value between 1 and 2. This reflects the mixing of aMarkov chain.
modelA “MODEL” object generated with function
newModel.SeenewModel.stdLoglikeA list contains the estimated standard deviations oflog-likelihoods.
penaltyThe vector stores the selected penalty values for each given
nby choosing from the candidatelambdalist. The selectedvalues produce closest standard deviationsstdLogliketo the targetsigma.resultThe result data frame.
callThe original code used to run
selectPenalty.
See Also
selectPenalty for the function that selects thepenalty parameter.
Performing BSL, uBSL, semiBSL and BSLmisspec
Description
This is the main function for performing MCMC BSL (with astandard or non-standard likelihood estimator) to sample from theapproximate posterior distribution. A couple of extentions to the standardapproach are available by changing the following arguments,method,shrinkage,whitening,misspecType. Parallel computingis supported with the R packageforeach.
Usage
bsl( y, n, M, model, covRandWalk, theta0, fnSim, fnSum, method = c("BSL", "uBSL", "semiBSL", "BSLmisspec"), shrinkage = NULL, penalty = NULL, fnPrior = NULL, simArgs = NULL, sumArgs = NULL, logitTransformBound = NULL, standardise = FALSE, GRC = FALSE, whitening = NULL, misspecType = NULL, tau = 1, parallel = FALSE, parallelArgs = NULL, thetaNames = NULL, plotOnTheFly = FALSE, verbose = 1L)Arguments
y | The observed data. Note this should be the raw dataset NOT theset of summary statistics. |
n | The number of simulations from the model per MCMC iteration forestimating the synthetic likelihood. |
M | The number of MCMC iterations. |
model | A “MODEL” object generated with function |
covRandWalk | The covariance matrix of a multivariate normal random walkproposal distribution used in the MCMC. |
theta0 | Deprecated, will be removed in the future, use |
fnSim | Deprecated, will be removed in the future, use |
fnSum | Deprecated, will be removed in the future, use |
method | A string argument indicating the method to be used. Thedefault, “BSL”, runs standard BSL. “uBSL” uses the unbiased estimatorof a normal density of Ghurye and Olkin (1969). “semiBSL”runs the semi-parametric BSL algorithm and is more robust to non-normalsummary statistics. “BSLmisspec” estimates the Gaussian syntheticlikelihood whilst acknowledging that there may be incompatibility betweenthe model and the observed summary statistic (Frazier and Drovandi 2021). |
shrinkage | A string argument indicating which shrinkage method tobe used. The default is |
penalty | The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”. |
fnPrior | Deprecated, will be removed in the future, use |
simArgs | Deprecated, will be removed in the future, use |
sumArgs | Deprecated, will be removed in the future, use |
logitTransformBound | A |
standardise | A logical argument that determines whether to standardisethe summary statistics before applying the graphical lasso. This is onlyvalid if method is “BSL”, shrinkage is “glasso” and penalty is not |
GRC | A logical argument indicating whether the Gaussian rankcorrelation matrix (Boudt et al. 2012) should be used to estimatethe covariance matrix in “BSL” method. The default is |
whitening | This argument determines whether Whitening transformationshould be used in “BSL” method with Warton's shrinkage. Whiteningtransformation helps decorrelate the summary statistics, thus encouragingsparsity of the synthetic likelihood covariance matrix. This might allowheavier shrinkage to be applied without losing much accuracy, henceallowing the number of simulations to be reduced. By default, |
misspecType | A string argument indicating which type of modelmisspecification to be used. The two options are "mean" and "variance".Only used when method is “BSLmisspec”. The default, |
tau | A numeric argument, parameter of the prior distributionfor "BSLmisspec" method. For mean adjustment, |
parallel | A logical value indicating whether parallel computing shouldbe used for simulation and summary statistic evaluation. The default is |
parallelArgs | A list of additional arguments to pass into the |
thetaNames | Deprecated, will be removed in the future, use |
plotOnTheFly | A logical or numeric argument defining whether or by howmany iterations a posterior figure will be plotted during running. If |
verbose | An integer indicating the verbose style. 0Lmeans no verbose messages will be printed. 1L uses a custom progress bar totrack the progress. 2L prints the iteration numbers ( |
Value
An object of classbsl is returned, seeBSLfor more information of the S4 class.
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).
See Also
ma2,cell,mgnk andtoad for examples.selectPenalty for a functionto tune the BSLasso tuning parameter andplot for functionsrelated to visualisation.
Examples
## Not run: # This is just a minimal test run, please see package built-in examples for more# comprehensive usages of the functiontoy_sim <- function(n, theta) matrix(rnorm(n, theta), nrow = n)toy_sum <- function(x) xmodel <- newModel(fnSimVec = toy_sim, fnSum = toy_sum, theta0 = 0)result_toy <- bsl(y = 1, n = 100, M = 1e4, model = model, covRandWalk = matrix(1), method = "BSL", plotOnTheFly = TRUE)summary(result_toy)plot(result_toy)## End(Not run)Cell biology example
Description
This example estimates the probabilities of cell motility andcell proliferation for a discrete-time stochastic model of cell spreading.We provide the data and tuning parameters required to reproduce the resultsin An et al. (2019).
Usage
data(ma2)cell_sim(theta, Yinit, rows, cols, sim_iters, num_obs)cell_sum(Y, Yinit)cell_prior(theta)Arguments
theta | A vector of proposed model parameters, |
Yinit | The initial matrix of cell presences of size |
rows | The number of rows in the lattice (rows in the cell locationmatrix). |
cols | The number of columns in the lattice (columns in the celllocation matrix). |
sim_iters | The number of discretisation steps to get to when anobservation is actually taken. For example, if observations are taken every5 minutes but the discretisation level is 2.5 minutes, then |
num_obs | The total number of images taken after initialisation. |
Y | A |
Details
Cell motility (movement) and proliferation (reproduction) causetumors to spread and wounds to heal. If we can measure cell proliferationand cell motility under different situations, then we may be able to usethis information to determine the efficacy of different medical treatments.
A common method for measuring in vitro cell movement and proliferation isthe scratch assay. Cells form a layer on an assay and, once they arecompletely covering the assay, a scratch is made to separate the cells.Images of the cells are taken until the scratch has closed up and the cellsare in contact again. Each image can be converted to a binary matrix byforming a lattice and recording the binary matrix (of sizerows\timescols) of cell presences.
The model that we consider is a random walk model with parameters for theprobability of cell movement(P_m) and the probabilityof cell proliferation(P_p) and it has notractable likelihood function. We use the vague priorsP_m \sim U(0,1)andP_p \sim U(0,1).
We have a total of 145 summary statistics, which are made up of the Hammingdistances between the binary matrices for each time point and the totalnumber of cells at the final time.
Details about the types of cells that this model is suitable for and otherinformation can be found in Price et al. (2018) andAn et al. (2019). Johnston et al. (2014)use a different ABC method and different summary statistics for a similarexample.
Functions
cell_sim: The functioncell_sim(theta, Yinit, rows, cols,sim_iters, num_obs)simulates data from the model, using C++ in thebackend.cell_sum: The functioncell_sum(Y,sum_options)calculates thesummary statistics for this example.cell_prior: The functioncell_prior(theta)evaluates the logprior density at the parameter value\theta.
A simulated dataset
An example “observed” dataset and the tuning parameters relevant to thatexample can be obtained usingdata(cell). This “observed” data isa simulated dataset withP_m = 0.35 andP_p = 0.001. The lattice has 27rows and 36cols and there arenum_obs = 144 observations after time 0(to mimic images being taken every 5 minutes for 12 hours). The simulationis based on there initially being 110 cells in the assay.
Further information about the specific choices of tuning parameters used inBSL and BSLasso can be found in An et al. (2019).
data: Therows\timescols\timesnum_obsarray of the cellpresences at times 1:144.sim_args: Values ofsim_argsrelevant to this example.sum_args: Values ofsum_argsrelevant to this example,i.e. just the value ofYinit.start: A vector of suitable initial values of the parametersfor MCMC.cov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 2\times2 matrix.
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Johnston ST, Simpson MJ, McElwain DLS, Binder BJ, Ross JV (2014).“Interpreting scratch assays using pair density dynamics and approximate Bayesian computation.”Open Biology,4(9), 140097.doi:10.1098/rsob.140097.
Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.
Examples
## Not run: require(doParallel) # You can use a different package to set up the parallel backend# Loading the data for this exampledata(cell)model <- newModel(fnSim = cell_sim, fnSum = cell_sum, simArgs = cell$sim_args, sumArgs = cell$sum_args, theta0 = cell$start, fnLogPrior = cell_prior, thetaNames = expression(P[m], P[p]))thetaExact <- c(0.35, 0.001)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov, parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellBSL)summary(resultCellBSL)plot(resultCellBSL, thetaTrue = thetaExact, thin = 20)# Performing uBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCelluBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov, method = "uBSL", parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCelluBSL)summary(resultCelluBSL)plot(resultCelluBSL, thetaTrue = thetaExact, thin = 20)# Performing tuning for BSLassossy <- cell_sum(cell$data, cell$sum_args$Yinit)lambda_all <- list(exp(seq(0.5,2.5,length.out=20)), exp(seq(0,2,length.out=20)), exp(seq(-1,1,length.out=20)), exp(seq(-1,1,length.out=20)))# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)set.seed(100)sp_cell <- selectPenalty(ssy, n = c(500, 1000, 1500, 2000), lambda_all, theta = thetaExact, M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso", parallelSim = TRUE, parallelMain = FALSE)stopCluster(cl)registerDoSEQ()sp_cellplot(sp_cell)# Performing BSLasso with a fixed penalty (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellBSLasso <- bsl(cell$data, n = 1500, M = 10000, model = model, covRandWalk = cell$cov, shrinkage = "glasso", penalty = 1.3, parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellBSLasso)summary(resultCellBSLasso)plot(resultCellBSLasso, thetaTrue = thetaExact, thin = 20)# Performing semiBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultCellSemiBSL <- bsl(cell$data, n = 5000, M = 10000, model = model, covRandWalk = cell$cov, method = "semiBSL", parallel = TRUE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultCellSemiBSL)summary(resultCellSemiBSL)plot(resultCellSemiBSL, thetaTrue = thetaExact, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(5, 4, 1, 2), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultCellBSL, resultCelluBSL, resultCellBSLasso, resultCellSemiBSL), which = 1, thetaTrue = thetaExact, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"), col = 1:4, lty = 1:4, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, cex = 1.5)par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)Plot the densities of multiple “bsl” class objects.
Description
The functioncombinePlotsBSL can be used to plot multipleBSL densities together, optionally with the true values for the parameters.
Usage
combinePlotsBSL( objectList, which = 1L, thin = 1, burnin = 0, thetaTrue = NULL, label = NULL, legendPosition = c("auto", "right", "bottom")[1], legendNcol = NULL, col = NULL, lty = NULL, lwd = NULL, cex.lab = 1, cex.axis = 1, cex.legend = 0.75, top = "Approximate Marginal Posteriors", options.color = list(), options.linetype = list(), options.size = list(), options.theme = list())Arguments
objectList | A list of “bsl” class objects. |
which | An integer argument indicating which plot function to beused. The default, |
thin | A numeric argument indicating the gap between samples tobe taken when thinning the MCMC draws. The default is |
burnin | the number of MCMC burn-in steps to be taken. |
thetaTrue | A set of true parameter values to be included on the plotsas a reference line. The default is |
label | A string vector indicating the labels to be shown inthe plot legend. The default is |
legendPosition | One of the three string arguments, “auto”, “right”or “bottom”, indicating the legend position. The default is “auto”,which automatically choose from “right” and “bottom”. Only used when |
legendNcol | An integer argument indicating the number of columns ofthe legend. The default, |
col | A vector argument containing the plotting color foreach density curve. Each element of the vector will be passed into |
lty | A vector argument containing the line type for eachdensity curve. Each element of the vector will be passed into |
lwd | A vector argument containing the line width for eachdensity curve. Each element of the vector will be passed into |
cex.lab | The magnification to be used for x and y labelsrelative to the current setting of cex. To be passed into |
cex.axis | The magnification to be used for axis annotationrelative to the current setting of cex. To be passed into |
cex.legend | The magnification to be used for legend annotationrelative to the current setting of cex. Only used when |
top | A string argument of the combined plot title. Only usedwhen |
options.color | A list of additional arguments to pass into function |
options.linetype | A list of additional arguments to pass into function |
options.size | A list of additional arguments to pass into function |
options.theme | A list of additional arguments to pass into the |
Value
No return value, called for the plots produced.
See Also
ma2,cell,mgnk andtoad for examples.
Examples
## Not run: toy_sim <- function(n, theta) matrix(rnorm(2*n, theta), nrow = n)toy_sum <- ma2_summodel <- newModel(fnSimVec = toy_sim, fnSum = toy_sum, sumArgs = list(epsilon = 2), theta0 = 0)result1 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1), method = "BSL", plotOnTheFly = TRUE)result2 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1), method = "uBSL", plotOnTheFly = TRUE)result3 <- bsl(y = 1:2, n = 100, M = 5e3, model = model, covRandWalk = matrix(1), method = "semiBSL", plotOnTheFly = TRUE)combinePlotsBSL(list(result1, result2, result3), label = c("BSL","uBSL","semiBSL"), thin = 20)## End(Not run)Convert a correlation matrix to a covariance matrix
Description
This function converts a correlation matrix to a covariance matrix
Usage
cor2cov(corr, std)Arguments
corr | The correlation matrix to be converted. This must be symmetric. |
std | A vector that contains the standard deviations of the variables in the correlation matrix. |
Value
The covariance matrix.
Estimate the synthetic likelihood
Description
This function computes the estimated synthetic (log) likelihoodusing one of the four methods (“BSL”, “uBSL”, “semiBSL” and“BSLmisspec”). Please find the links below in the see also section formore details.
Usage
estimateLoglike( ssy, ssx, method = c("BSL", "uBSL", "semiBSL", "BSLmisspec"), log = TRUE, verbose = FALSE, ...)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
method | A string argument indicating the method to be used. Thedefault, “BSL”, runs standard BSL. “uBSL” uses the unbiased estimatorof a normal density of Ghurye and Olkin (1969). “semiBSL”runs the semi-parametric BSL algorithm and is more robust to non-normalsummary statistics. “BSLmisspec” estimates the Gaussian syntheticlikelihood whilst acknowledging that there may be incompatibility betweenthe model and the observed summary statistic (Frazier and Drovandi 2021). |
log | A logical argument indicating if the log of likelihood isgiven as the result. The default is |
verbose | A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault is |
... | Arguments to be passed to methods.
|
Value
The estimated synthetic (log) likelihood value.
References
Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.
Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
See Also
gaussianSynLike,gaussianSynLikeGhuryeOlkin,semiparaKernelEstimate andsynLikeMisspec.
Examples
data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssxestimateLoglike(ssy, ssx, method = "BSL")estimateLoglike(ssy, ssx, method = "uBSL")estimateLoglike(ssy, ssx, method = "semiBSL")estimateLoglike(ssy, ssx, method = "BSLmisspec", type = "mean", gamma = rep(0.1, 50))Estimate the Whitening matrix to be used in the “wBSL” method ofPriddle et al. (2021)
Description
This function estimates the Whitening matrix to be used in BSLwith Warton's shrinkage and Whitening (“wBSL” method ofPriddle et al. (2021)). The Whitening transformation anddecorrelation methods are detailed in Kessy et al. (2018).
Usage
estimateWhiteningMatrix( n, model, method = c("PCA", "ZCA", "Cholesky", "ZCA-cor", "PCA-cor"), thetaPoint = NULL, parallel = FALSE, parallelArgs = NULL)Arguments
n | The number of model simulations to estimate the Whitening matrix. |
model | A “MODEL” object generated with function |
method | The type of Whitening method to be used. The default is“PCA”. |
thetaPoint | A point estimate of the parameter value with non-negligibleposterior support. |
parallel | A logical value indicating whether parallel computing shouldbe used for simulation and summary statistic evaluation. The default is |
parallelArgs | A list of additional arguments to pass into the |
Value
The estimated Whitening matrix.
References
Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.
Priddle JW, Sisson SA, Frazier DT, Turner I, Drovandi C (2021).“Efficient Bayesian Synthetic Likelihood with Whitening Transformations.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1909.04857.
Examples
## Not run: data(ma2)model <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start)W <- estimateWhiteningMatrix(20000, model, method = "PCA", thetaPoint = c(0.6, 0.2))## End(Not run)Gaussian rank correlation
Description
This function computes the Gaussian rank correlation ofBoudt et al. (2012).
Usage
gaussianRankCorr(x, vec = FALSE)Arguments
x | A numeric matrix representing data where the number of rowsis the number of independent data points and the number of columns is thenumber of variables in the dataset. |
vec | A logical argument indicating if the vector of correlationsshould be returned instead of a matrix. |
Value
Gaussian rank correlation matrix (default) or a vector ofpair correlations.
References
Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.
See Also
cor2cov for conversion from correlation matrixto covariance matrix.
Examples
data(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = list(TT = 10), theta0 = ma2$start, fnLogPrior = ma2_prior)set.seed(100)# generate 1000 simualtions from the ma2 modelx <- simulation(model, n = 1000, theta = c(0.6, 0.2))$xcorr1 <- cor(x) # traditional correlation matrixcorr2 <- gaussianRankCorr(x) # Gaussian rank correlation matrixoldpar <- par()par(mfrow = c(1, 2))image(corr1, main = 'traditional correlation matrix')image(corr2, main = 'Gaussian rank correlation matrix')par(mfrow = oldpar$mfrow)std <- apply(x, MARGIN = 2, FUN = sd) # standard deviationscor2cov(gaussianRankCorr(x), std) # convert to covariance matrixEstimate the Gaussian synthetic (log) likelihood
Description
This function estimates the Gaussian synthetic log-likelihood(see Wood 2010 and Price et al. 2018). Several extensions areprovided in this function:shrinkage enables shrinkage estimation ofthe covariance matrix and is helpful to bring down the number of modelsimulations (see An et al. (2019) for an example of BSLwith glasso (Friedman et al. 2008) shrinkage estimation);GRC uses Gaussian rank correlation (Boudt et al. 2012) tofind a more robust correlation matrix;whitening(Kessy et al. 2018) could further reduce the number of modelsimulations upon Warton's shrinkage (Warton 2008) bydecorrelating the summary statistics.
Usage
gaussianSynLike( ssy, ssx, shrinkage = NULL, penalty = NULL, standardise = FALSE, GRC = FALSE, whitening = NULL, ssyTilde = NULL, log = TRUE, verbose = FALSE)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
shrinkage | A string argument indicating which shrinkage method tobe used. The default is |
penalty | The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”. |
standardise | A logical argument that determines whether to standardisethe summary statistics before applying the graphical lasso. This is onlyvalid if method is “BSL”, shrinkage is “glasso” and penalty is not |
GRC | A logical argument indicating whether the Gaussian rankcorrelation matrix (Boudt et al. 2012) should be used to estimatethe covariance matrix in “BSL” method. The default is |
whitening | This argument determines whether Whitening transformationshould be used in “BSL” method with Warton's shrinkage. Whiteningtransformation helps decorrelate the summary statistics, thus encouragingsparsity of the synthetic likelihood covariance matrix. This might allowheavier shrinkage to be applied without losing much accuracy, henceallowing the number of simulations to be reduced. By default, |
ssyTilde | The whitened observed summary statisic. If this is not |
log | A logical argument indicating if the log of likelihood isgiven as the result. The default is |
verbose | A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault is |
Value
The estimated synthetic (log) likelihood value.
References
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Kessy A, Lewin A, Strimmer K (2018).“Optimal Whitening and Decorrelation.”The American Statistician,72(4), 309–314.doi:10.1080/00031305.2016.1277159.
Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
Wood SN (2010).“Statistical Inference for Noisy Nonlinear Ecological Dynamic Systems.”Nature,466, 1102–1107.doi:10.1038/nature09319.
See Also
Other available synthetic likelihood estimators:gaussianSynLikeGhuryeOlkin for the unbiased syntheticlikelihood estimator,semiparaKernelEstimate for thesemi-parametric likelihood estimator,synLikeMisspec for theGaussian synthetic likelihood estimator for model misspecification.
Examples
data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# the standard Gaussian synthetic likelihood (the likelihood estimator used in BSL)gaussianSynLike(ssy, ssx)# the Gaussian synthetic likelihood with glasso shrinkage estimation# (the likelihood estimator used in BSLasso)gaussianSynLike(ssy, ssx, shrinkage = 'glasso', penalty = 0.1)# the Gaussian synthetic likelihood with Warton's shrinkage estimationgaussianSynLike(ssy, ssx, shrinkage = 'Warton', penalty = 0.9)# the Gaussian synthetic likelihood with Warton's shrinkage estimation and Whitening transformationW <- estimateWhiteningMatrix(20000, m)gaussianSynLike(ssy, ssx, shrinkage = 'Warton', penalty = 0.9, whitening = W)Estimate the Gaussian synthetic (log) likelihood with an unbiased estimator
Description
This function computes an unbiased, nonnegative estimate of anormal density function from simulations assumed to be drawn from it. SeePrice et al. (2018) andGhurye and Olkin (1969).
Usage
gaussianSynLikeGhuryeOlkin(ssy, ssx, log = TRUE, verbose = FALSE)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
log | A logical argument indicating if the log of likelihood isgiven as the result. The default is |
verbose | A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault is |
Value
The estimated synthetic (log) likelihood value.
References
Ghurye SG, Olkin I (1969).“Unbiased Estimation of Some Multivariate Probability Densities and Related Functions.”Ann. Math. Statist.,40(4), 1261–1271.
Price LF, Drovandi CC, Lee A, Nott DJ (2018).“Bayesian Synthetic Likelihood.”Journal of Computational and Graphical Statistics,27, 1–11.doi:10.1080/10618600.2017.1302882.
See Also
Other available synthetic likelihood estimators:gaussianSynLike for the standard synthetic likelihoodestimator,semiparaKernelEstimate for the semi-parametriclikelihood estimator,synLikeMisspec for the Gaussiansynthetic likelihood estimator for model misspecification.
Examples
data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start)ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# unbiased estimate of the Gaussian synthetic likelihood# (the likelihood estimator used in uBSL)gaussianSynLikeGhuryeOlkin(ssy, ssx)Obtain the gamma samples (the latent parameters for BSLmisspec method) from a"BSL" object
Description
seeBSLclass
Usage
getGamma(object, ...)Arguments
object | A “BSL” class object. |
... | Other arguments. |
Value
The matrix of gamma samples (the latent parameters for BSLmisspecmethod), after removing burn-in and thinning.
Obtain the log-likelihoods from a "BSL" object
Description
seeBSLclass
Usage
getLoglike(object, ...)Arguments
object | A “BSL” class object. |
... | Other arguments. |
Value
The vector of log likelihood evaluations, after removing burn-in and thinning.
Obtain the selected penalty values from a "PENALTY" object
Description
seePENALTYclass
Usage
getPenalty(object, ...)Arguments
object | A “PENALTY” class object. |
... | Other arguments. |
Value
The selecty penalty values.
Obtain the samples from a "BSL" object
Description
seeBSLclass
Usage
getTheta(object, ...)Arguments
object | A “BSL” class object. |
... | Other arguments. |
Value
The matrix of samples, after removing burn-in and thinning.
An MA(2) model
Description
In this example we wish to estimate the parameters of a simpleMA(2) time series model. We provide the data and tuning parameters requiredto reproduce the results in An et al. (2019).The journal article An et al. (2022) provides a fulldescription of how to use this package for the toad example.
Usage
data(ma2)ma2_sim(theta, TT)ma2_sim_vec(n, theta, TT)ma2_sum(x, epsilon = 0, delta = 1)ma2_prior(theta)Arguments
theta | A vector of proposed model parameters, |
TT | The number of observations. |
n | The number of simulations to run with the vectorisedsimulation function. |
x | Observed or simulated data in the format of a vector of length |
epsilon | The skewness parameter in the sinh-arcsinh transformation. |
delta | The kurtosis parameter in the sinh-arcsinh transformation. |
Details
This example is based on estimating the parameters of a basic MA(2)time series model of the form
y_t = z_t + \theta_1 z_{t-1} + \theta_2 z_{t-2},
wheret=1,\ldots,TT andz_t \sim N(0,1)fort=-1,0,\ldots,TT. A uniformprior is used for this example, subject to the restrictions that-2<\theta_1<2,\theta_1+\theta_2>-1and\theta_1-\theta_2<1so that invertibility of the time series is satisfied. The summarystatistics are simply the full data.
Functions
ma2_sim: Simulates an MA(2) time series.ma2_sim_vec: Simulates n MA(2) time series with a vectorised simulationfunction.ma2_sum: Returns the summary statistics for a given data set. Theskewness and kurtosis of the summary statistics can be controlled via the\epsilonand\deltaparameters. This is thesinh-arcsinnh transformation of Jones and Pewsey (2009). By default,the summary statistics function simply returns the raw data. Otherwise, thetransformation is introduced to motivate the “semiBSL” method.ma2_prior: Evaluates the (unnormalised) log prior, which is uniformsubject to several restrictions related to invertibility of the timeseries.
A simulated dataset
An example “observed” dataset and the tuning parameters relevant to thatexample can be obtained usingdata(ma2). This “observed” data is asimulated dataset with\theta_1 = 0.6,\theta_2=0.2 andTT=50. Further information about this model and the specific choicesof tuning parameters used in BSL and BSLasso can be found in An et al.(2019).
data: A time series dataset, in the form of a vector of lengthTTsim_args: A list containingTT=50start: A vector of suitable initial values of the parametersfor MCMCcov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 2\times2 matrix
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
An Z, South LF, Drovandi CC (2022).“BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood.”Journal of Statistical Software,101(11), 1–33.doi:10.18637/jss.v101.i11.
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Jones MC, Pewsey A (2009).“Sinh-arcsinh distributions.”Biometrika,96(4), 761–780.ISSN 0006-3444.
Examples
## Not run: # Load the data for this example and set up the model objectdata(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, fnLogPrior = ma2_prior)thetaExact <- c(0.6, 0.2)# reduce the number of iterations M if desired for all methods below# Method 1: standard BSLresultMa2BSL <- bsl(y = ma2$data, n = 500, M = 300000, model = model, covRandWalk = ma2$cov, method = "BSL", verbose = 1L)show(resultMa2BSL)summary(resultMa2BSL)plot(resultMa2BSL, thetaTrue = thetaExact, thin = 20)# Method 2: unbiased BSLresultMa2uBSL <- bsl(y = ma2$data, n = 500, M = 300000, model = model, covRandWalk=ma2$cov, method = "uBSL", verbose = 1L)show(resultMa2uBSL)summary(resultMa2uBSL)plot(resultMa2uBSL, thetaTrue = thetaExact, thin = 20)# Method 3: BSLasso (BSL with glasso shrinkage estimation)# tune the penalty parameter fisrtssy <- ma2_sum(ma2$data)lambdaAll <- list(exp(seq(-5.5,-1.5,length.out=20)))set.seed(100)penaltyGlasso <- selectPenalty(ssy = ssy, n = 300, lambdaAll, theta = thetaExact, M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso")penaltyGlassoplot(penaltyGlasso)resultMa2BSLasso <- bsl(y = ma2$data, n = 300, M = 250000, model = model, covRandWalk=ma2$cov, method = "BSL", shrinkage = "glasso", penalty = 0.027, verbose = 1L)show(resultMa2BSLasso)summary(resultMa2BSLasso)plot(resultMa2BSLasso, thetaTrue = thetaExact, thin = 20)# Method 4: BSL with Warton's shrinkage and Whitening# estimate the Whtieing matrix and tune the penalty parameter firstW <- estimateWhiteningMatrix(20000, model, method = "PCA", thetaPoint = ma2$start)gammaAll <- list(seq(0.3, 0.8, 0.02))set.seed(100)penaltyWarton <- selectPenalty(ssy = ssy, n = 300, gammaAll, theta = thetaExact, M = 100, sigma = 1.2, model = model, method = "BSL", shrinkage = "Warton", whitening = W)penaltyWartonplot(penaltyWarton, logscale = FALSE)resultMa2Whitening <- bsl(y = ma2$data, n = 300, M = 250000, model = model, covRandWalk=ma2$cov, method = "BSL", shrinkage = "Warton", whitening = W, penalty = 0.52, verbose = 1L)show(resultMa2Whitening)summary(resultMa2Whitening)plot(resultMa2Whitening, thetaTrue = thetaExact, thin = 20)# Method 5: semiBSL, the summary statistics function is different from previous methodsmodel2 <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args, sumArgs = list(epsilon = 2), theta0 = ma2$start, fnLogPrior = ma2_prior)sim <- simulation(model, n = 1e4, theta = ma2$start, seed = 1) # run a short simulationplot(density(sim$ssx[, 1])) # the first marginal summary statistic is right-skewedresultMa2SemiBSL <- bsl(y = ma2$data, n = 500, M = 200000, model = model2, covRandWalk=ma2$cov, method = "semiBSL", verbose = 1L)show(resultMa2SemiBSL)summary(resultMa2SemiBSL)plot(resultMa2SemiBSL, thetaTrue = thetaExact, thin = 20)# Method 6: BSL with consideration of model misspecification (mean adjustment)resultMa2Mean <- bsl(y = ma2$data, n = 500, M = 200000, model = model, covRandWalk=ma2$cov, method = "BSLmisspec", misspecType = "mean", verbose = 1L)show(resultMa2Mean)summary(resultMa2Mean)plot(resultMa2Mean, thetaTrue = thetaExact, thin = 20)# Method 7: BSL with consideration of model misspecification (variance inflation)resultMa2Variance <- bsl(y = ma2$data, n = 500, M = 200000, model = model, covRandWalk=ma2$cov, method = "BSLmisspec", misspecType = "variance", verbose = 1L)show(resultMa2Variance)summary(resultMa2Variance)plot(resultMa2Variance, thetaTrue = thetaExact, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(5, 4, 1, 2), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultMa2BSL, resultMa2uBSL, resultMa2BSLasso, resultMa2SemiBSL), which = 1, thetaTrue = thetaExact, thin = 20, label = c("bsl", "uBSL", "bslasso", "semiBSL"), col = c("black", "red", "blue", "green"), lty = 1:4, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, cex = 1.5)# plot using the ggplot2 packagecombinePlotsBSL(list(resultMa2BSL, resultMa2uBSL, resultMa2BSLasso, resultMa2SemiBSL), which = 2, thetaTrue = thetaExact, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"), options.color = list(values=c("black", "red", "blue", "green")), options.linetype = list(values = 1:4), options.size = list(values = rep(1, 4)), options.theme = list(plot.margin = grid::unit(rep(0.03,4), "npc"), axis.title = ggplot2::element_text(size=12), axis.text = ggplot2::element_text(size = 8), legend.text = ggplot2::element_text(size = 12)))par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)The multivariate G&K example
Description
Here we provide the data and tuning parameters required to reproducethe results from the multivariate G & K (Drovandi and Pettitt 2011) example from An et al. (2019).
Usage
data(mgnk)mgnk_sim(theta_tilde, TT, J, bound)mgnk_sum(y)Arguments
theta_tilde | A vector with 15 elements for the proposed model parameters. |
TT | The number of observations in the data. |
J | The number of variables in the data. |
bound | A matrix of boundaries for the uniform prior. |
y | A |
Details
It is not practical to give a reasonable explanation of this example through R documentationgiven the number of equations involved. We refer the reader to the BSLasso paper (An et al. 2019)at <doi:10.1080/10618600.2018.1537928> for information on the model and summary statistic used in this example.
An example dataset
We use the foreign currency exchange data available fromhttps://www.rba.gov.au/statistics/historical-data.htmlas in An et al. (2019).
data: A1651\times3matrix of data.sim_args: Values ofsim_argsrelevant to this example.start: A vector of suitable initial values of the parameters for MCMC.cov: The covariance matrix of a multivariate normal random walk proposal distribution used in the MCMC, in the form of a 15 by 15 matrix
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Drovandi CC, Pettitt AN (2011).“Likelihood-free Bayesian estimation of multivariate quantile distributions.”Computational Statistics & Data Analysis,55(9), 2541–2556.ISSN 0167-9473, doi:10.1016/j.csda.2011.03.019.
Examples
## Not run: require(doParallel) # You can use a different package to set up the parallel backendrequire(MASS)require(elliplot)# Loading the data for this exampledata(mgnk)model <- newModel(fnSim = mgnk_sim, fnSum = mgnk_sum, simArgs = mgnk$sim_args, theta0 = mgnk$start, thetaNames = expression(a[1],b[1],g[1],k[1],a[2],b[2],g[2],k[2], a[3],b[3],g[3],k[3],delta[12],delta[13],delta[23]))# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov, method = "BSL", parallel = FALSE, verbose = 1L, plotOnTheFly = TRUE)stopCluster(cl)registerDoSEQ()show(resultMgnkBSL)summary(resultMgnkBSL)plot(resultMgnkBSL, which = 2, thin = 20)# Performing uBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkuBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov, method = "uBSL", parallel = FALSE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkuBSL)summary(resultMgnkuBSL)plot(resultMgnkuBSL, which = 2, thin = 20)# Performing tuning for BSLassossy <- mgnk_sum(mgnk$data)lambda_all <- list(exp(seq(-2.5,0.5,length.out=20)), exp(seq(-2.5,0.5,length.out=20)), exp(seq(-4,-0.5,length.out=20)), exp(seq(-5,-2,length.out=20)))# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)set.seed(100)sp_mgnk <- selectPenalty(ssy, n = c(15, 20, 30, 50), lambda = lambda_all, theta = mgnk$start, M = 100, sigma = 1.5, model = model, method = "BSL", shrinkage = "glasso", standardise = TRUE, parallelSim = TRUE, parallelSimArgs = list(.packages = "MASS", .export = "ninenum"), parallelMain = TRUE)stopCluster(cl)registerDoSEQ()sp_mgnkplot(sp_mgnk)# Performing BSLasso with a fixed penalty (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkBSLasso <- bsl(mgnk$data, n = 20, M = 80000, model = model, covRandWalk = mgnk$cov, method = "BSL", shrinkage = "glasso", penalty = 0.3, standardise = TRUE, parallel = FALSE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkBSLasso)summary(resultMgnkBSLasso)plot(resultMgnkBSLasso, which = 2, thin = 20)# Performing semiBSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultMgnkSemiBSL <- bsl(mgnk$data, n = 60, M = 80000, model = model, covRandWalk = mgnk$cov, method = "semiBSL", parallel = FALSE, verbose = 1L)stopCluster(cl)registerDoSEQ()show(resultMgnkSemiBSL)summary(resultMgnkSemiBSL)plot(resultMgnkSemiBSL, which = 2, thin = 20)# Plotting the results together for comparison# plot using the R default plot functionoldpar <- par()par(mar = c(4, 4, 1, 1), oma = c(0, 1, 2, 0))combinePlotsBSL(list(resultMgnkBSL, resultMgnkuBSL, resultMgnkBSLasso, resultMgnkSemiBSL), which = 1, thin = 20, label = c("bsl", "ubsl", "bslasso", "semiBSL"), col = c("red", "yellow", "blue", "green"), lty = 2:5, lwd = 1)mtext("Approximate Univariate Posteriors", outer = TRUE, line = 0.75, cex = 1.2)# plot using the ggplot2 packagecombinePlotsBSL(list(resultMgnkBSL, resultMgnkuBSL, resultMgnkBSLasso, resultMgnkSemiBSL), which = 2, thin = 20, label=c("bsl","ubsl","bslasso","semiBSL"), options.color=list(values=c("red","yellow","blue","green")), options.linetype = list(values = 2:5), options.size = list(values = rep(1, 4)), options.theme = list(plot.margin = grid::unit(rep(0.03,4),"npc"), axis.title = ggplot2::element_text(size=12), axis.text = ggplot2::element_text(size = 8), legend.text = ggplot2::element_text(size = 12)))par(mar = oldpar$mar, oma = oldpar$oma)## End(Not run)Progress Bar
Description
Print a customisable progress bar in the console.
Usage
myMiniProgressBar(p, txt1 = "", txt2 = "", style = 1, label = c("=", "-", "|"))Arguments
p | Numeric, percentage of finished progress, between 0 and 1. |
txt1 | String to put before the progress bar |
txt2 | String to put after the progress bar |
style | The display style. 1 is single-lined; 2 is double-lined; 3 display the progress in a 5-lined block. |
label | Character labels for "finished", "un-finished", and "side bars". |
Value
No return value, called for side effects.
Convert an observation matrix to a vector of n-day displacements
Description
Convert an observation matrix to a vector of n-daydisplacements. This is a function for the toad example.
Usage
obsMat2deltax(X, lag)Arguments
X | The observation matrix to be converted. |
lag | Interger, the number of day lags to compute the displacement. |
Value
A vector of displacements.
Generate a random sample from the zero-centered stable distribution
Description
Draw a sample from a symmetric, zero-centered stable distribution withgiven scale and stability (alpha) parameters, using the CMS algorithm.
Usage
rstable(scale, alpha)Arguments
scale | The scale parameter. |
alpha | The stability parameter. |
Value
A random sample from the zero-centered stable distribution.
Selecting the Penalty Parameter
Description
This is the main function for selecting the shrinkage (graphicallasso or Warton's estimator) penalty parameter for method BSL or semiBSLbased on a point estimate of the parameters. Parallel computing issupported with the R packageforeach. The penalty selection methodis outlined in An et al. (2019).
Usage
selectPenalty( ssy, n, lambda, M, sigma = 1.5, model, theta = NULL, method = c("BSL", "semiBSL"), shrinkage = c("glasso", "Warton"), parallelSim = FALSE, parallelSimArgs = NULL, parallelMain = FALSE, verbose = 1L, ...)Arguments
ssy | A summary statistic vector for the observeddata. |
n | A vector of possible values of |
lambda | A list, with each entry containing the vector ofpenalty values to test for the corresponding choice of |
M | The number of repeats to use in estimating thestandard deviation of the estimated log synthetic likelihood. |
sigma | The standard deviation of the log syntheticlikelihood estimator to aim for, usually a value between 1 and 2. Thisparameter helps to control the mixing of a Markov chain. |
model | A “MODEL” object generated with function |
theta | A point estimate of the parameter value whichall of the simulations will be based on. By default, if |
method | A string argument indicating the method to beused. If the method is “BSL”, the shrinkage is applied to the Gaussiancovariance matrix. Otherwise if the method is “semiBSL”, the shrinkage isapplied to the correlation matrix of the Gaussian copula. |
shrinkage | A string argument indicating which shrinkage method tobe used. Current options are “glasso” for the graphical lasso method ofFriedman et al. (2008) and “Warton” for the ridgeregularisation method of Warton (2008). |
parallelSim | A logical value indicating whether parallelcomputing should be used for simulation and summary statistic evaluation.Default is |
parallelSimArgs | A list of additional arguments to pass into the |
parallelMain | A logical value indicating whether parallelcomputing should be used to computing the graphical lasso function. Noticethat this should only be turned on when there are a lot of candidate valuesin |
verbose | An integer indicating the verbose style. 0Lmeans no verbose messages will be printed. 1L uses a custom progress bar totrack the progress. 2L prints the iteration numbers ( |
... | Other arguments to pass to |
Value
An S4 objectPENALTY of the penalty selection results. Theshow andplot methods are provided with the S4 class.
Author(s)
Ziwen An, Leah F. South and Christopher Drovandi
References
An Z, South LF, Nott DJ, Drovandi CC (2019).“Accelerating Bayesian Synthetic Likelihood With the Graphical Lasso.”Journal of Computational and Graphical Statistics,28(2), 471–475.doi:10.1080/10618600.2018.1537928.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
See Also
PENALTY for the usage of the S4 class.ma2,cell andmgnk for examples.bslfor the main function to run BSL.
Examples
## Not run: data(ma2)model <- newModel(fnSimVec = ma2_sim_vec, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, fnLogPrior = ma2_prior)theta <- c(0.6,0.2)# Performing tuning for BSLasso (BSL with glasso shrinkage estimation)ssy <- ma2_sum(ma2$data)lambda_all <- list(exp(seq(-3,0.5,length.out=20)), exp(seq(-4,-0.5,length.out=20)), exp(seq(-5.5,-1.5,length.out=20)), exp(seq(-7,-2,length.out=20)))set.seed(100)sp_ma2 <- selectPenalty(ssy = ssy, n = c(50, 150, 300, 500), lambda_all, theta = theta, M = 100, sigma = 1.5, model = model, method = 'BSL', shrinkage = 'glasso')sp_ma2plot(sp_ma2)## End(Not run)Estimate the semi-parametric synthetic (log) likelihood
Description
This function computes the semi-parametric synthetic likelihoodestimator of (An et al. 2019). The advantage of thissemi-parametric estimator over the standard synthetic likelihood estimatoris that the semi-parametric one is more robust to non-normal summarystatistics. Kernel density estimation is used for modelling each univariatemarginal distribution, and the dependence structure between summaries arecaptured using a Gaussian copula. Shrinkage on the correlation matrixparameter of the Gaussian copula is helpful in decreasing the number ofsimulations.
Usage
semiparaKernelEstimate( ssy, ssx, kernel = "gaussian", shrinkage = NULL, penalty = NULL, log = TRUE)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
kernel | A string argument indicating the smoothing kernel to passinto |
shrinkage | A string argument indicating which shrinkage method tobe used. The default is |
penalty | The penalty value to be used for the specified shrinkagemethod. Must be between zero and one if the shrinkage method is “Warton”. |
log | A logical argument indicating if the log of likelihood isgiven as the result. The default is |
Value
The estimated synthetic (log) likelihood value.
References
An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
Friedman J, Hastie T, Tibshirani R (2008).“Sparse Inverse Covariance Estimation with the Graphical Lasso.”Biostatistics,9(3), 432–441.
Warton DI (2008).“Penalized Normal Likelihood and Ridge Regularization of Correlation and Covariance Matrices.”Journal of the American Statistical Association,103(481), 340–349.doi:10.1198/016214508000000021.
Boudt K, Cornelissen J, Croux C (2012).“The Gaussian Rank Correlation Estimator: Robustness Properties.”Statistics and Computing,22(2), 471–483.doi:10.1007/s11222-011-9237-0.
See Also
Other available synthetic likelihood estimators:gaussianSynLike for the standard synthetic likelihoodestimator,gaussianSynLikeGhuryeOlkin for the unbiasedsynthetic likelihood estimator,synLikeMisspec for theGaussian synthetic likelihood estimator for model misspecification.
Examples
data(ma2)ssy <- ma2_sum(ma2$data)m <- newModel(fnSim = ma2_sim, fnSum = ma2_sum, simArgs = ma2$sim_args, theta0 = ma2$start, sumArgs = list(delta = 0.5))ssx <- simulation(m, n = 300, theta = c(0.6, 0.2), seed = 10)$ssx# check the distribution of the first summary statistic: highly non-normalplot(density(ssx[, 1]))# the standard synthetic likelihood estimator over-estimates the likelihood heregaussianSynLike(ssy, ssx)# the semi-parametric synthetic likelihood estimator is more robust to non-normalitysemiparaKernelEstimate(ssy, ssx)# using shrinkage on the correlation matrix of the Gaussian copula is also possiblesemiparaKernelEstimate(ssy, ssx, shrinkage = "Warton", penalty = 0.8)The simulation function for the toad example
Description
The simulation function for the toad example.
Usage
sim_toad(params, ntoad, nday, model = 1L, d0 = 100)Arguments
params | A vector of proposed model parameters, |
ntoad | The number of toads to simulate in the observation. |
nday | The number of days lasted of the observation. |
model | Which model to be used. 1 for the random return model, 2 for the nearest return model,and 3 for the distance-based return probability model. |
d0 | Characteristic distance for model 3. Only used if model is 3. |
Value
A data matrix.
Examples
sim_toad(c(1.7,36,0.6), 10, 8, 1)Simulation function of the cell biology example
Description
Simulation function of the cell biology example.
Usage
simulate_cell(x, rows, cols, Pm, Pp, sim_iters, num_obs)Arguments
x | The initial matrix of cell presences of size |
rows | The number of rows in the lattice (rows in the cell locationmatrix). |
cols | The number of columns in the lattice (columns in the celllocation matrix). |
Pm | Parameter |
Pp | Parameter |
sim_iters | The number of discretisation steps to get to when anobservation is actually taken. For example, if observations are taken every5 minutes but the discretisation level is 2.5 minutes, then |
num_obs | The total number of images taken after initialisation. |
Value
Arows\timescols\timesnum_obs arrayof the cell presences at times1:num_obs (not time 0).
Run simulations with a give "MODEL" object
Description
seeMODEL
Usage
simulation(model, ...)Arguments
model | A “MODEL” object. |
... | Other arguments. |
Generate a random sample of gamma for the R-BSL-M method ofFrazier and Drovandi (2021) using slice sampling
Description
This function updates the gamma of the R-BSL-M method ofFrazier and Drovandi (2021) with a slice sampler(Neal 2003). Note this function is mainly designed forinternal usage.
Usage
sliceGammaMean( ssy, ssx, loglike, gamma = numeric(length(ssy)), tau = 1, w = 1, std = NULL, maxit = 1000)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
loglike | The current log synthetic likelihood. This is computed withfunction |
gamma | The additional latent parameter to account for possibleincompatability between the model and observed summary statistic. In“BSLmisspec” method, this is updated with a slice sampler(Neal 2003). The default gamma implies no model misspecificationand is equivalent to the standard |
tau | Scale (or inverse rate) parameter of the Laplace priordistribution for gamma. |
w | Step size for the stepping out in the slice sampler. The defaultstep size is 1. |
std | Standard deviation of the columns of ssx. If this is not |
maxit | The maximum number of iteration of the stepping out and shrinksteps of slice sampler. The default is 1e3. |
References
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.
See Also
sliceGammaVariance for the slice sampler of thevariance inflated target distribution.
Generate a random sample of gamma for the R-BSL-V method ofFrazier and Drovandi (2021) using slice sampling
Description
This function updates the gamma parameter with a slice sampler(Neal 2003). The target distribution is the varianceinflated approximate posterior of BSL with model misspecification. SeeFrazier and Drovandi (2021). Note this function is mainlydesigned for internal usage.
Usage
sliceGammaVariance( ssy, ssx, loglike, gamma = numeric(length(ssy)), tau = 1, w = 1, std = NULL, maxit = 1000)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
loglike | The current log synthetic likelihood. This is computed withfunction |
gamma | The additional latent parameter to account for possibleincompatability between the model and observed summary statistic. In“BSLmisspec” method, this is updated with a slice sampler(Neal 2003). The default gamma implies no model misspecificationand is equivalent to the standard |
tau | Numeric. Scale (or inverse rate) parameter of the exponentialprior distribution. |
w | Step size for the stepping out in the slice sampler. The defaultstep size is 1. |
std | Standard deviation of the columns of ssx. If this is not |
maxit | The maximum number of iteration of the stepping out and shrinksteps of slice sampler. The default is 1e3. |
References
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.
See Also
sliceGammaMean for the slice sampler of the meanadjusted target distribution.
Compute the summary statistics with the given data
Description
seeMODEL
Usage
summStat(x, model)Arguments
x | The data to pass to the summary statistics function. |
model | A “MODEL” object. |
Estimate the Gaussian synthetic (log) likelihood whilst acknowledging modelincompatibility
Description
This function estimates the Gaussian synthetic likelihood whilstacknowledging that there may be incompatibility between the model and theobserved summary statistic. The method has two different ways toaccount for and detect incompatibility (mean adjustment and varianceinflation). An additional free parametergamma is employed to account for themodel misspecification. See the R-BSL methods ofFrazier and Drovandi (2021) for more details. Note this functionis mainly designed for interal use as the latent variablegamma needto be chosen otherwise. Alternatively,gamma is updated with a slicesampler (Neal 2003), which is the method ofFrazier and Drovandi (2021).
Usage
synLikeMisspec( ssy, ssx, type = c("mean", "variance"), gamma = numeric(length(ssy)), log = TRUE, verbose = FALSE)Arguments
ssy | The observed summary statisic. |
ssx | A matrix of the simulated summary statistics. The numberof rows is the same as the number of simulations per iteration. |
type | A string argument indicating which method is used to account forand detect potential incompatibility. The two options are "mean" and"variance" for mean adjustment and variance inflation, respectively. |
gamma | The additional latent parameter to account for possibleincompatability between the model and observed summary statistic. In“BSLmisspec” method, this is updated with a slice sampler(Neal 2003). The default gamma implies no model misspecificationand is equivalent to the standard |
log | A logical argument indicating if the log of likelihood isgiven as the result. The default is |
verbose | A logical argument indicating whether an error messageshould be printed if the function fails to compute a likelihood. Thedefault is |
Value
The estimated synthetic (log) likelihood value.
References
Frazier DT, Drovandi C (2021).“Robust Approximate Bayesian Inference with Synthetic Likelihood.”Journal of Computational and Graphical Statistics (In Press).https://arxiv.org/abs/1904.04551.
Neal RM (2003).“Slice sampling.”The Annals of Statistics,31(3), 705–767.
See Also
Other available synthetic likelihood estimators:gaussianSynLike for the standard synthetic likelihoodestimator,gaussianSynLikeGhuryeOlkin for the unbiasedsynthetic likelihood estimator,semiparaKernelEstimate forthe semi-parametric likelihood estimator,synLikeMisspec forthe Gaussian synthetic likelihood estimator for model misspecification.Slice sampler to sample gammasliceGammaMean andsliceGammaVariance (internal functions).
Examples
# a toy model (for details see section 4.1 from Frazier et al 2019)# the true underlying model is a normal distribution with standard deviation equals to 0.2# whist the data generation process has the standard deviation fixed to 1set.seed(1)y <- rnorm(50, 1, sd = 0.2)ssy <- c(mean(y), var(y))m <- newModel(fnSim = function(theta) rnorm(50, theta), fnSum = function(x) c(mean(x), var(x)), theta0 = 1, fnLogPrior = function(x) log(dnorm(x, sd = sqrt(10))))ssx <- simulation(m, n = 300, theta = 1, seed = 10)$ssx# gamma is updated with a slice samplergamma <- rep(0.1, length(ssy))synLikeMisspec(ssy, ssx, type = "variance", gamma = gamma)Toad example
Description
This example estimates the parameter for the toad example. Themodel simulates the movement of an amphibian called Fowler's toad. Themodel is proposed by Marchand et al. (2017). This exampleincludes both simulated and real data. The real data is obtained from the supplementary material of Marchand et al. (2017).The journal article An et al. (2022) provides a fulldescription of how to use this package for the toad example.
Usage
data(toad)toad_sim( theta, ntoads, ndays, model = 1, d0 = 100, na = matrix(FALSE, ndays, ntoads))toad_sum(X, lag = c(1, 2, 4, 8), p = seq(0, 1, 0.1))toad_prior(theta)Arguments
theta | A vector of proposed model parameters, |
ntoads | The number of toads to simulate in the observation. |
ndays | The number of days observed. |
model | Which model to be used: 1 for the random return model, 2 for thenearest return model, and 3 for the distance-based return probabilitymodel. The default is 1. |
d0 | Characteristic distance for model 3. Only used if |
na | Logical. This is the index matrix for missing observations. Bydefault, |
X | The data matrix. |
lag | The lag of days to compute the summary statistics, default as 1,2, 4 and 8. |
p | The numeric vector of probabilities to compute the quantiles. |
Details
The example includes the three different returning models ofMarchand et al. (2017). Please seeMarchand et al. (2017) for a full description of the toadmodel, and also An et al. (2019) for Bayesian inferencewith the semi-BSL method.
Functions
toad_sim: Simulates data from the model, using C++ in the backend.toad_sum: Computes the summary statistics for this example. The summary statistics are the log differences between adjacent quantiles and also the median.toad_prior: Evaluates the log prior at the chosen parameters.
datasets (simulated and real)
A simulated dataset and a real dataset are provided in this example. Bothdatasets contain observations from 66 toads for 63 days. The simulateddataset is simulated with parameter\theta = (1.7, 35, 0.6). This is the data used in An et al. (2019). The realdataset is obtained from the supplementary data ofMarchand et al. (2017).
data_simulated: A 63\times66 matrix of the observedtoad locations (simulated data).data_real: A 63\times66 matrix of the observedtoad locations (real data).cov: The covariance matrix of a multivariate normal randomwalk proposal distribution used in the MCMC, in the form of a 3\times3 matrix.theta0: A vector of suitable initial values of the parametersfor MCMC.sim_args_simulatedandsim_args_real: A list of thearguments to pass into the simulation function.ndays: The number of days observed.ntoads: The total number of toads being observed.model: Indicator of which model to be used.na: Indicator matrix for missingness.
Author(s)
Ziwen An, Leah F. South andChristopher Drovandi
References
An Z, Nott DJ, Drovandi C (2019).“Robust Bayesian Synthetic Likelihood via a Semi-Parametric Approach.”Statistics and Computing (In Press).
An Z, South LF, Drovandi CC (2022).“BSL: An R Package for Efficient Parameter Estimation for Simulation-Based Models via Bayesian Synthetic Likelihood.”Journal of Statistical Software,101(11), 1–33.doi:10.18637/jss.v101.i11.
Marchand P, Boenke M, Green DM (2017).“A stochastic movement model reproduces patterns of site fidelity and long-distance dispersal in a population of Fowlers toads (Anaxyrus fowleri).”Ecological Modelling,360, 63–69.ISSN 0304-3800, doi:10.1016/j.ecolmodel.2017.06.025.()
Examples
## Not run: require(doParallel) # You can use a different package to set up the parallel backenddata(toad)## run standard BSL for the simulated datasetmodel1 <- newModel(fnSim = toad_sim, fnSum = toad_sum, theta0 = toad$theta0, fnLogPrior = toad_prior, simArgs = toad$sim_args_simulated, thetaNames = expression(alpha,gamma,p[0]))paraBound <- matrix(c(1,2,0,100,0,0.9), 3, 2, byrow = TRUE)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultToadSimulated <- bsl(toad$data_simulated, n = 1000, M = 10000, model = model1, covRandWalk = toad$cov, logitTransformBound = paraBound, parallel = TRUE, verbose = 1L, plotOnTheFly = 100)stopCluster(cl)registerDoSEQ()show(resultToadSimulated)summary(resultToadSimulated)plot(resultToadSimulated, thetaTrue = toad$theta0, thin = 20)## run standard BSL for the real datasetmodel2 <- newModel(fnSim = toad_sim, fnSum = toad_sum, theta0 = toad$theta0, fnLogPrior = toad_prior, simArgs = toad$sim_args_real, thetaNames = expression(alpha,gamma,p[0]))paraBound <- matrix(c(1,2,0,100,0,0.9), 3, 2, byrow = TRUE)# Performing BSL (reduce the number of iterations M if desired)# Opening up the parallel pools using doParallelcl <- makeCluster(min(detectCores() - 1,2))registerDoParallel(cl)resultToadReal <- bsl(toad$data_real, n = 1000, M = 10000, model = model2, covRandWalk = toad$cov, logitTransformBound = paraBound, parallel = TRUE, verbose = 1L, plotOnTheFly = 100)stopCluster(cl)registerDoSEQ()show(resultToadReal)summary(resultToadReal)plot(resultToadReal, thetaTrue = toad$theta0, thin = 20)## End(Not run)