| Version: | 0.19-4.1 |
| Date: | 2020-09-30 |
| Title: | Output Analysis and Diagnostics for MCMC |
| Depends: | R (≥ 2.14.0) |
| Imports: | lattice |
| Description: | Provides functions for summarizing and plotting theoutput from Markov Chain Monte Carlo (MCMC) simulations, aswell as diagnostic tests of convergence to the equilibriumdistribution of the Markov chain. |
| License: | GPL-2 |GPL-3 [expanded from: GPL (≥ 2)] |
| NeedsCompilation: | no |
| Packaged: | 2024-01-31 09:50:53 UTC; ripley |
| Author: | Martyn Plummer [aut, cre, trl], Nicky Best [aut], Kate Cowles [aut], Karen Vines [aut], Deepayan Sarkar [aut], Douglas Bates [aut], Russell Almond [aut], Arni Magnusson [aut] |
| Maintainer: | Martyn Plummer <martyn.plummer@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2024-01-31 09:52:36 UTC |
The Cramer-von Mises Distribution
Description
Distribution function of the Cramer-von Mises distribution.
Usage
pcramer(q, eps)Arguments
q | vector of quantiles. |
eps | accuracy required |
Value
pcramer gives the distribution function,
References
Anderson TW. and Darling DA. Asymptotic theory of certain 'goodnessof fit' criteria based on stochastic processes.Ann. Math.Statist.,23, 192-212 (1952).
Csorgo S. and Faraway, JJ. The exact and asymptotic distributions ofthe Cramer-von Mises statistic. J. Roy. Stat. Soc. (B),58,221-234 (1996).
Highest Posterior Density intervals
Description
Create Highest Posterior Density (HPD) intervals for the parameters inan MCMC sample.
Usage
HPDinterval(obj, prob = 0.95, ...)## S3 method for class 'mcmc'HPDinterval(obj, prob = 0.95, ...)## S3 method for class 'mcmc.list'HPDinterval(obj, prob = 0.95, ...)Arguments
obj | The object containing the MCMC sample - usually of class |
.
prob | A numeric scalar in the interval (0,1) giving the targetprobability content of the intervals. The nominal probabilitycontent of the intervals is the multiple of |
... | Optional additional arguments for methods. None are usedat present. |
Details
For each parameter the interval is constructed from the empirical cdfof the sample as the shortest interval for which the difference inthe ecdf values of the endpoints is the nominal probability. Assumingthat the distribution is not severely multimodal, this is the HPD interval.
Value
For an"mcmc" object, a matrix with columns"lower" and"upper" and rows corresponding to the parameters. Theattribute"Probability" is the nominal probability content ofthe intervals. A list of such matrices is returned for an"mcmc.list" object.
Author(s)
Douglas Bates
Examples
data(line)HPDinterval(line)Coerce mcmc object to time series
Description
theas.ts method formcmc objects coerces an mcmc object to a time series.
Usage
## S3 method for class 'mcmc'as.ts(x,...)Arguments
x | an mcmc object |
... | unused arguments for compatibility with generic |
Author(s)
Martyn Plummer
See Also
Autocorrelation function for Markov chains
Description
autocorr calculates the autocorrelation function for theMarkov chainmcmc.obj at the lags given bylags.The lag values are taken to be relative to the thinning intervalifrelative=TRUE.
High autocorrelations within chains indicate slow mixing and, usually,slow convergence. It may be useful to thin out a chain with highautocorrelations before calculating summary statistics: a thinnedchain may contain most of the information, but take up less space inmemory. Re-running the MCMC sampler with a different parameterizationmay help to reduce autocorrelation.
Usage
autocorr(x, lags = c(0, 1, 5, 10, 50), relative=TRUE)Arguments
x | an mcmc object |
lags | a vector of lags at which to calculate the autocorrelation |
relative | a logical flag. TRUE if lags are relative to the thinninginterval of the chain, or FALSE if they are absolute difference in iterationnumbers |
Value
A vector or array containing the autocorrelations.
Author(s)
Martyn Plummer
See Also
Autocorrelation function for Markov chains
Description
autocorr.diag calculates the autocorrelation function for theMarkov chainmcmc.obj at the lags given bylags.The lag values are taken to be relative to the thinning intervalifrelative=TRUE. Unlikeautocorr, ifmcmc.objhas many parmeters it only computes the autocorrelations with itself andnot the cross correlations. In cases whereautocorr wouldreturn a matrix, this function returns the diagonal of the matrix.Hence it is more useful for chains with many parameters, but may notbe as helpful at spotting parameters.
Ifmcmc.obj is of classmcmc.list then the returnedvector is the average autocorrelation across all chains.
Usage
autocorr.diag(mcmc.obj, ...)Arguments
mcmc.obj | an object of class |
... | optional arguments to be passed to |
Value
A vector containing the autocorrelations.
Author(s)
Russell Almond
See Also
Plot autocorrelations for Markov Chains
Description
Plots the autocorrelation function for each variable in each chain in x.
Usage
autocorr.plot(x, lag.max, auto.layout = TRUE, ask, ...)Arguments
x | A Markov Chain |
lag.max | Maximum value at which to calculate acf |
auto.layout | If |
ask | If |
... | graphical parameters |
See Also
Batch Standard Error
Description
Effective standard deviation of population to produce the correctstandard errors.
Usage
batchSE(x, batchSize=100)Arguments
x | An |
batchSize | Number of observations to include in each batch. |
Details
Because of the autocorrelation, the usual method of takingvar(x)/n overstates the precision of the estimate. This methodworks around the problem by looking at the means of batches of theparameter. If the batch size is large enough, the batch means shouldbe approximately uncorrelated and the normal formula for computing thestandard error should work.
The batch standard error procedure is usually thought to be not asaccurate as the time series methods used insummary andeffectiveSize. It is included here for completeness.
Value
A vector giving the standard error for each column ofx.
Author(s)
Russell Almond
References
Roberts, GO (1996) Markov chain concepts related to sampling algorithms,in Gilks, WR, Richardson, S and Spiegelhalter, DJ,Markov ChainMonte Carlo in Practice, Chapman and Hall, 45-58.
See Also
spectrum0.ar,effectiveSize,summary.mcmc
Convert WinBUGS data file to JAGS data file
Description
bugs2jags converts a WinBUGS data in the format called "S-Plus"(i.e. the format created by thedput function) and writes it indump format used by JAGS.
NB WinBUGS stores its arrays in row order. This is differentfrom R and JAGS which both store arrays in column order. Thisdifference is taken into account bybugs2jags which willautomatically reorder the data in arrays, without changing thedimension.
Not yet available in S-PLUS.
Usage
bugs2jags(infile, outfile)Arguments
infile | name of the input file |
outfile | name of the output file |
Note
If the input file is saved from WinBUGS, it must be saved in plain textformat. The default format for files saved from WinBUGS is a binarycompound document format (with extension odc) that cannot be read bybugs2jags.
Author(s)
Martyn Plummer
References
Spiegelhalter DJ, Thomas A, Best NG and Lunn D (2003).WinBUGS version 1.4 user manualMRC Biostatistics Unit, Cambridge, UK.
See Also
Options settings for the codamenu driver
Description
coda.options is a utility function that queries and setsoptions for thecodamenu() function. These settings affectthe behaviour of the functions in the coda library only when theyare called via thecodamenu() interface.
Thecoda.options() function behaves just like theoptions() function in the base library, with the additionalfeature thatcoda.options(default=TRUE) will reset all optionsto the default values.
Options can be pretty-printed using thedisplay.coda.options()function, which groups the options into sections.
Available options are
- bandwidth
Bandwidth function used when smoothing samples toproduce density estimates. Defaults to Silverman's “Rule of thumb”.
- combine.corr
Logical option that determines whether tocombine multiple chains when calculating cross-correlations.
- combine.plots
Logical option that determines whether tocombine multiple chains when plotting.
- combine.plots
Logical option that determines whether tocombine multiple chains when calculating summary statistics.
- data.saved
For internal use only.
- densplot
Logical option that determines whether to plota density plot when plot methods are called for mcmc objects.
- digits
Number of significant digits to use when printing.
- frac1
For Geweke diagnostic, fraction to use fromstart of chain. Defaults to 0.1
- frac2
For Geweke diagnostic, fraction to use fromend of chain. Default to 0.5.
- gr.bin
For Geweke-Brooks plot, number of iterations to useper bin.
- gr.max
For Geweke-Brooks plot, maximum number of bins touse. This option overrides
gr.bin.- halfwidth
For Heidelberger and Welch diagnostic, the targetvalue for the ratio of half width to sample mean.
- lowess
Logical option that controls whether to plot a smooth line through a trace plot when plotting MCMC output.
- q
For Raftery and Lewis diagnostic, the target quantile tobe estimated
- r
For Raftery and Lewis diagnostic, the required precision.
- s
For Raftery and Lewis diagnostic, the probability ofobtaining an estimate in the interval (q-r, q+r).
- quantiles
Vector of quantiles to print when calculatingsummary statistics for MCMC output.
- trace
Logical option that determines whether to plot a traceof the sampled output when plotting MCMC output.
- user.layout
Logical option that determines whether currentvalue of par("mfrow") should be used for plots (TRUE) or whetherthe optimal layout should be calculated (FALSE).
Usage
coda.options(...)display.coda.options(stats = FALSE, plots = FALSE, diags = FALSE)Arguments
stats | logical flag: show summary statistic options? |
plots | logical flag: show plotting options? |
diags | logical flag: show plotting options? |
... | list of options |
See Also
Main menu driver for the coda package
Description
codamenu presents a simple menu-based interface to the functionsin the coda package. It is designed for users who know nothing aboutthe R/S language.
Usage
codamenu()Author(s)
Kate Cowles, Nicky Best, Karen Vines, Martyn Plummer
Cross correlations for MCMC output
Description
crosscorr calculates cross-correlations betweenvariables in Markov Chain Monte Carlo output. Ifxis an mcmc.list then all chains inx are combinedbefore calculating the correlation.
Usage
crosscorr(x)Arguments
x | an |
Value
A matrix or 3-d array containing the correlations.
See Also
Plot image of correlation matrix
Description
crosscorr.plot provides an image of the correlation matrix forx. Ifx is anmcmc.list object, then all chainsare combined.
The range [-1,1] is divided into a number of equal-length categoriesgiven by the length ofcol and assigned the corresponding color.By default, topographic colours are used as this makes it easier todistinguish positive and negative correlations.
Usage
crosscorr.plot (x, col = topo.colors(10), ...)Arguments
x | an |
col | color palette to use |
... | graphical parameters |
See Also
Cumulative quantile plot
Description
Plots the evolution of the sample quantiles as a function of thenumber of iterations.
Usage
cumuplot(x, probs=c(0.025,0.5,0.975), ylab="", lty=c(2,1), lwd=c(1,2), type="l", ask, auto.layout=TRUE, col=1, ...)Arguments
x | an |
probs | vector of desired quantiles |
ylab,lty,lwd,type,col | graphical parameters |
auto.layout | If |
ask | If |
... | further graphical parameters |
Author(s)
Arni Magnusson
Probability density function estimate from MCMC output
Description
Displays a plot of the density estimate for each variable inx,calculated by thedensity function. For discrete-valuedvariables, a histogram is produced.
Usage
densplot(x, show.obs = TRUE, bwf, ylim, xlab, ylab = "", type="l", main, right=TRUE, ...)Arguments
x | An |
show.obs | Show observations along the x-axis |
bwf | Function for calculating the bandwidth. If omitted,the bandwidth is calculate by 1.06 times the minimum of the standarddeviation and the interquartile range divided by 1.34 times the samplesize to the negative one fifth power |
ylim | Limits on y axis. See |
xlab | X-axis label. By default this will show the sample sizeand the bandwidth used for smoothing. See |
ylab | Y-axis label. By default, this is blank. See |
type | Plot type. See |
main | An overall title for the plot. See |
right | Logical flag for discrete-valued distributions passed tothe |
.
... | Further graphical parameters |
Details
For discrete-valued distributions, a histogram is produced and valuesare aggregated using the pretty() function. By default, tick marksappear to the right of the corresponding bar in the histogram and givethe inclusive upper limit of the hist (right=TRUE). This can bemodified by specifyingright=FALSE. In this case tick marksappear on the left and specify the inclusive lower limit of the bar.
For continous distributions, if a variable is bounded below at 0, orbounded in the interval [0,1], then the data are reflected at theboundary before being passed to the density() function. This allowscorrect estimation of a non-zero density at the boundary.
Note
You can call this function directly, but it is more usually called bytheplot.mcmc function.
See Also
Effective sample size for estimating the mean
Description
Sample size adjusted for autocorrelation.
Usage
effectiveSize(x)Arguments
x | An mcmc or mcmc.list object. |
Details
For a time seriesx of lengthN, the standard error of themean is the square root ofvar(x)/n wheren is theeffective sample size.n = N only when there is noautocorrelation.
Estimation of the effective sample size requires estimating thespectral density at frequency zero. This is done by the functionspectrum0.ar
For amcmc.list object, the effective sizes are summed acrosschains. To get the size for each chain individually uselapply(x,effectiveSize).
Value
A vector giving the effective sample size for each column ofx.
See Also
Gelman and Rubin's convergence diagnostic
Description
The ‘potential scale reduction factor’ is calculated for each variable inx, together with upper and lower confidence limits. Approximateconvergence is diagnosed when the upper limit is close to 1. Formultivariate chains, a multivariate value is calculated that boundsabove the potential scale reduction factor for any linear combinationof the (possibly transformed) variables.
The confidence limits are based on the assumption that the stationarydistribution of the variable under examination is normal. Hence the‘transform’ parameter may be used to improve the normal approximation.
Usage
gelman.diag(x, confidence = 0.95, transform=FALSE, autoburnin=TRUE, multivariate=TRUE)Arguments
x | An |
confidence | the coverage probability of the confidence interval for thepotential scale reduction factor |
transform | a logical flag indicating whether variables in |
autoburnin | a logical flag indicating whether only the second halfof the series should be used in the computation. If set to TRUE(default) and |
multivariate | a logical flag indicating whether the multivariatepotential scale reduction factor should be calculated for multivariatechains |
Value
An object of classgelman.diag. This is a list with thefollowing elements:
psrf | A list containing the point estimates of the potentialscale reduction factor (labelled |
mpsrf | The point estimate of the multivariate potential scale reductionfactor. This is NULL if there is only one variable in |
Thegelman.diag class has its ownprint method.
Theory
Gelman and Rubin (1992) propose a general approach to monitoringconvergence of MCMC output in whichm > 1 parallel chains are runwith starting values that are overdispersed relative to the posteriordistribution. Convergence is diagnosed when the chains have ‘forgotten’their initial values, and the output from all chains isindistinguishable. Thegelman.diag diagnostic is applied to asingle variable from the chain. It is based a comparison of within-chainand between-chain variances, and is similar to a classical analysis ofvariance.
There are two ways to estimate the variance of the stationary distribution:the mean of the empirical variance within each chain,W, andthe empirical variance from all chains combined, which can be expressed as
\widehat{\sigma}^2 = \frac{(n-1) W }{n} + \frac{B}{n}
wheren is the number of iterations andB/n is the empiricalbetween-chain variance.
If the chains have converged, then both estimates areunbiased. Otherwise the first method willunderestimate thevariance, since the individual chains have not had time to range allover the stationary distribution, and the second method willoverestimate the variance, since the starting points were chosento be overdispersed.
The convergence diagnostic is based on the assumption that thetarget distribution is normal. A Bayesian credible interval canbe constructed using a t-distribution with mean
\widehat{\mu}=\mbox{Sample mean of all chainscombined}
and variance
\widehat{V}=\widehat{\sigma}^2 + \frac{B}{mn}
and degrees of freedom estimated by the method of moments
d = \frac{2*\widehat{V}^2}{\mbox{Var}(\widehat{V})}
Use of the t-distribution accounts for the fact that the meanand variance of the posterior distribution are estimated.
The convergence diagnostic itself is
R=\sqrt{\frac{(d+3) \widehat{V}}{(d+1)W}}
Values substantially above 1 indicate lack of convergence. If thechains have not converged, Bayesian credible intervals based on thet-distribution are too wide, and have the potential to shrink by thisfactor if the MCMC run is continued.
Note
The multivariate a version of Gelman and Rubin's diagnostic wasproposed by Brooks and Gelman (1998). Unlike the univariate proportionalscale reduction factor, the multivariate version does not include anadjustment for the estimated number of degrees of freedom.
References
Gelman, A and Rubin, DB (1992) Inference from iterative simulationusing multiple sequences,Statistical Science,7, 457-511.
Brooks, SP. and Gelman, A. (1998) General methods for monitoringconvergence of iterative simulations.Journal of Computational andGraphical Statistics,7, 434-455.
See Also
Gelman-Rubin-Brooks plot
Description
This plot shows the evolution of Gelman and Rubin's shrink factor asthe number of iterations increases.
Usage
gelman.plot(x, bin.width = 10, max.bins = 50,confidence = 0.95, transform = FALSE, autoburnin=TRUE, auto.layout = TRUE,ask, col, lty, xlab, ylab, type, ...)Arguments
x | an mcmc object |
bin.width | Number of observations per segment, excluding thefirst segment which always has at least 50 iterations. |
max.bins | Maximum number of bins, excluding the last one. |
confidence | Coverage probability of confidence interval. |
transform | Automatic variable transformation (see |
autoburnin | Remove first half of sequence (see |
auto.layout | If |
ask | Prompt user before displaying each page of plots. Default is |
col | graphical parameter (see |
lty | graphical parameter (see |
xlab | graphical parameter (see |
ylab | graphical parameter (see |
type | graphical parameter (see |
... | further graphical parameters. |
Details
The Markov chain is divided into bins according to the argumentsbin.width andmax.bins. Then the Gelman-Rubin shrink factoris repeatedly calculated. The first shrink factor is calculated withobservations 1:50, the second with observations1:(50+bin.width),the third contains samples1:(50+2*bin.width) and so on.If the chain has less than50 + bin.width iterations thengelman.diag will exit with an error.
Theory
A potential problem withgelman.diag is that it may mis-diagnoseconvergence if the shrink factor happens to be close to 1 by chance.By calculating the shrink factor at several points in time,gelman.plot shows if the shrink factor has really converged, orwhether it is still fluctuating.
References
Brooks, S P. and Gelman, A. (1998) General Methods for MonitoringConvergence of Iterative Simulations.Journal of Computational andGraphical Statistics,7, 434-455.
See Also
Geweke's convergence diagnostic
Description
Geweke (1992) proposed a convergence diagnostic for Markov chainsbased on a test for equality of the means of the first and last partof a Markov chain (by default the first 10% and the last 50%). If thesamples are drawn from the stationary distribution of the chain, the twomeans are equal and Geweke's statistic has an asymptotically standardnormal distribution.
The test statistic is a standard Z-score: the difference between thetwo sample means divided by its estimated standard error. The standarderror is estimated from the spectral density at zero and so takes intoaccount any autocorrelation.
The Z-score is calculated under the assumption that the two parts ofthe chain are asymptotically independent, which requires that the sumoffrac1 andfrac2 be strictly less than 1.
Usage
geweke.diag(x, frac1=0.1, frac2=0.5)Arguments
x | an mcmc object |
frac1 | fraction to use from beginning of chain |
frac2 | fraction to use from end of chain |
Value
Z-scores for a test of equality of meansbetween the first and last parts of the chain. A separatestatistic is calculated for each variable in each chain.
References
Geweke, J. Evaluating the accuracy of sampling-based approachesto calculating posterior moments. InBayesian Statistics 4(ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press,Oxford, UK.
See Also
Geweke-Brooks plot
Description
Ifgeweke.diag indicates that the first and last part of a samplefrom a Markov chain are not drawn from the same distribution, it maybe useful to discard the first few iterations to see if the rest of thechain has "converged". This plot shows what happens to Geweke's Z-scorewhen successively larger numbers of iterations are discarded from thebeginning of the chain. To preserve the asymptotic conditions requiredfor Geweke's diagnostic, the plot never discards more than half the chain.
The first half of the Markov chain is divided intonbins - 1segments, then Geweke's Z-score is repeatedly calculated. The firstZ-score is calculated with all iterations in the chain, the secondafter discarding the first segment, the third after discarding the firsttwo segments, and so on. The last Z-score is calculated using only thesamples in the second half of the chain.
Usage
geweke.plot(x, frac1 = 0.1, frac2 = 0.5, nbins = 20, pvalue = 0.05, auto.layout = TRUE, ask, ...)Arguments
x | an mcmc object |
frac1 | fraction to use from beginning of chain. |
frac2 | fraction to use from end of chain. |
nbins | Number of segments. |
pvalue | p-value used to plot confidence limits for the null hypothesis. |
auto.layout | If |
ask | If |
... | Graphical parameters. |
Note
The graphical implementation of Geweke's diagnostic was suggestedby Steve Brooks.
See Also
Heidelberger and Welch's convergence diagnostic
Description
heidel.diag is a run length control diagnostic based on a criterionof relative accuracy for the estimate of the mean. The default settingcorresponds to a relative accuracy of two significant digits.
heidel.diag also implements a convergence diagnostic, and removesup to half the chain in order to ensure that the means are estimatedfrom a chain that has converged.
Usage
heidel.diag(x, eps=0.1, pvalue=0.05)Arguments
x | An |
eps | Target value for ratio of halfwidth to sample mean |
pvalue | significance level to use |
Details
The convergence test uses the Cramer-von-Mises statistic to test the nullhypothesis that the sampled values come from a stationary distribution.The test is successively applied, firstly to the whole chain, then afterdiscarding the first 10%, 20%, ... of the chain until either the nullhypothesis is accepted, or 50% of the chain has been discarded. The latteroutcome constitutes ‘failure’ of the stationarity test and indicatesthat a longer MCMC run is needed. If the stationarity test is passed,the number of iterations to keep and the number to discard are reported.
The half-width test calculates a 95% confidence interval for themean, using the portion of the chain which passed the stationarity test.Half the width of this interval is compared with the estimate of the mean.If the ratio between the half-width and the mean is lower thaneps,the halfwidth test is passed. Otherwise the length of the sample isdeemed not long enough to estimate the mean with sufficient accuracy.
Theory
Theheidel.diag diagnostic is based on the work of Heidelbergerand Welch (1983), who combined their earlier work on simulation runlength control (Heidelberger and Welch, 1981) with the work of Schruben(1982) on detecting initial transients using Brownian bridge theory.
Note
If the half-width test fails then the run should be extended. In orderto avoid problems caused by sequential testing, the test should notbe repeated too frequently. Heidelberger and Welch (1981) suggestincreasing the run length by a factor I > 1.5, each time, so that estimate has the same, reasonably large, proportion of new data.
References
Heidelberger P and Welch PD. A spectral method for confidence intervalgeneration and run length control in simulations. Comm. ACM.24,233-245 (1981)
Heidelberger P and Welch PD. Simulation run length control in the presence of an initial transient.Opns Res.,31,1109-44 (1983)
Schruben LW. Detecting initialization bias in simulation experiments.Opns. Res.,30, 569-590 (1982).
Simple linear regression example
Description
Sample MCMC output from a simple linear regression model given inthe BUGS manual.
Usage
data(line)Format
Anmcmc object
Source
Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1995)BUGS: Bayesian inference using Gibbs Sampling, Version 0.5, MRC Biostatistics Unit, Cambridge.
Markov Chain Monte Carlo Objects
Description
The functionmcmc is used to create a Markov Chain Monte Carloobject. The input data are taken to be a vector, or a matrix withone column per variable.
If the optional argumentsstart,end, andthinare omitted then the chain is assumed to start with iteration 1 andhave thinning interval 1. Ifdata represents a chain thatstarts at a later iteration, the first iteration in the chain shouldbe given as thestart argument. Likewise, ifdatarepresents a chain that has already been thinned, the thinninginterval should be given as thethin argument.
An mcmc object may be summarized by thesummary functionand visualized with theplot function.
MCMC objects resemble time series (ts) objects and havemethods for the generic functionstime,start,end,frequency andwindow.
Usage
mcmc(data= NA, start = 1, end = numeric(0), thin = 1)as.mcmc(x, ...)is.mcmc(x)Arguments
data | a vector or matrix of MCMC output |
start | the iteration number of the first observation |
end | the iteration number of the last observation |
thin | the thinning interval between consecutive observations |
x | An object that may be coerced to an mcmc object |
... | Further arguments to be passed to specific methods |
Note
The format of the mcmc class has changed between coda version 0.3and 0.4. Older mcmc objects will now causeis.mcmc tofail with an appropriate warning message. Obsolete mcmc objects canbe upgraded with themcmcUpgrade function.
Author(s)
Martyn Plummer
See Also
mcmc.list,mcmcUpgrade,thin,window.mcmc,summary.mcmc,plot.mcmc.
Conversions of MCMC objects
Description
These are methods for the generic functionsas.matrix(),as.array() andas.mcmc().
as.matrix() strips the MCMC attributes from anmcmcobject and returns a matrix. Ifiters = TRUE then acolumn is added with the iteration number. Formcmc.listobjects, the rows of multiple chains are concatenated and, ifchains = TRUE a column is added with the chain number.
mcmc.list objects can be coerced to 3-dimensional arrayswith theas.array() function.
Anmcmc.list object with a single chain can be coercedto anmcmc object withas.mcmc(). If the argumenthas multiple chains, this causes an error.
Usage
## S3 method for class 'mcmc'as.matrix(x, iters = FALSE, ...)## S3 method for class 'mcmc.list'as.matrix(x, iters = FALSE, chains = FALSE, ...)## S3 method for class 'mcmc.list'as.array(x, drop, ...)Arguments
x | An |
iters | logical flag: add column for iteration number? |
chains | logical flag: add column for chain number? (if mcmc.list) |
drop | logical flag: if |
... | optional arguments to the various methods |
See Also
Replicated Markov Chain Monte Carlo Objects
Description
The function ‘mcmc.list’ is used to represent parallel runs of the samechain, with different starting values and random seeds. The list mustbe balanced: each chain in the list must have the same iterations andthe same variables.
Diagnostic functions which act onmcmc objects may also be appliedtomcmc.list objects. In general, the chains will be combined,if this makes sense, otherwise the diagnostic function will be appliedseparately to each chain in the list.
Since all the chains in the list have the same iterations, a single timedimension can be ascribed to the list. Hence there are time series methodstime,window,start,end,frequencyandthin formcmc.list objects.
Anmcmc.list can be indexed as if it were a single mcmc objectusing the[ operator (see examples below). The[[ operatorselects a singlemcmc object from the list.
Usage
mcmc.list(...)as.mcmc.list(x, ...)is.mcmc.list(x)Arguments
... | a list of mcmc objects |
x | an object that may be coerced to mcmc.list |
Author(s)
Martyn Plummer
See Also
mcmc.
Examples
data(line)x1 <- line[[1]] #Select first chainx2 <- line[,1, drop=FALSE] #Select first var from all chainsvarnames(x2) == varnames(line)[1] #TRUEExtract or replace parts of MCMC objects
Description
These are methods for subsettingmcmc objects. You can selectiterations using the first dimension and variables using the seconddimension. Selecting iterations will return a vector or matrix, not anmcmc object. If you want to do row-subsetting of anmcmcobject and preserve its dimensions, use thewindow function.
Subsetting applied to anmcmc.list object will simultaneously affect all the parallel chains in the object.
Usage
## S3 method for class 'mcmc'x[i,j, drop=missing(i)]## S3 method for class 'mcmc.list'x[i,j, drop=TRUE]Arguments
x | An |
i | Row to extract |
j | Column to extract |
drop | if |
See Also
Upgrade mcmc objects in obsolete format
Description
In previous releases of CODA, anmcmc object couldbe a single or multiple chains. A new classmcmc.listhas now been introduced to deal with multiple chains andmcmc objects can only have data from a single chain.
Objects stored in the old format are now obsoleteand must be upgraded.
Usage
mcmcUpgrade(x)Arguments
x | an obsolete |
Author(s)
Martyn Plummer
See Also
mcmc.
Mcpar attribute of MCMC objects
Description
The ‘mcpar’ attribute of an MCMC object gives the start iterationthe end iteration and the thinning interval of the chain.
It resembles the ‘tsp’ attribute of time series (ts) objects.
Usage
mcpar(x)Arguments
x | An |
See Also
Choose multiple options from a menu
Description
multi.menu presents the user with a menu of choiceslabelled from 1 to the number of choices. The user may chooseone or more options by entering a comma separated list. A rangeof values may also be specified using the ":" operator. Mixedexpressions such as "1,3:5, 6" are permitted.
Ifallow.zero is set to TRUE, one can select ‘0’ to exitwithout choosing an item.
Usage
multi.menu(choices, title, header, allow.zero = TRUE)Arguments
choices | Character vector of labels for choices |
title | Title printed before menu |
header | Character vector of length 2 giving column titles |
allow.zero | Permit 0 as an acceptable response |
Value
Numeric vector giving the numbers of the options selected, or0 if no selection is made.
Author(s)
Martyn Plummer
See Also
menu.
Dimensions of MCMC objects
Description
These functions give the dimensions of an MCMC object
- niter(x)
returns the number of iterations.
- nvar(x)
returns the number of variables.
- nchain(x)
returns the number of parallel chains.
Usage
niter(x)nvar(x)nchain(x)Arguments
x | An |
Value
A numeric vector of length 1:
See Also
Summary plots of mcmc objects
Description
plot.mcmc summarizes an mcmc or mcmc.list objectwith a trace of the sampled output and a density estimatefor each variable in the chain.
Usage
## S3 method for class 'mcmc'plot(x, trace = TRUE, density = TRUE, smooth = FALSE, bwf,auto.layout = TRUE, ask = dev.interactive(), ...)Arguments
x | an object of class |
trace | Plot trace of each variable |
density | Plot density estimate of each variable |
smooth | Draw a smooth line through trace plots |
bwf | Bandwidth function for density plots |
auto.layout | Automatically generate output format |
ask | Prompt user before each page of plots |
... | Further arguments |
Author(s)
Martyn Plummer
See Also
Raftery and Lewis's diagnostic
Description
raftery.diag is a run length control diagnostic based on acriterion of accuracy of estimation of the quantileq. It isintended for use on a short pilot run of a Markov chain. The numberof iterations required to estimate the quantileq to within anaccuracy of +/-r with probabilityp is calculated. Separatecalculations are performed for each variable within each chain.
If the number of iterations indata is too small,an error message is printed indicating the minimum length ofpilot run. The minimum length is the required sample size for achain with no correlation between consecutive samples. Positiveautocorrelation will increase the required sample size above thisminimum value. An estimateI (the ‘dependence factor’) of theextent to which autocorrelation inflates the required sample sizeis also provided. Values ofI larger than 5 indicate strongautocorrelation which may be due to a poor choice of starting value,high posterior correlations or ‘stickiness’ of the MCMC algorithm.
The number of ‘burn in’ iterations to be discarded at the beginningof the chain is also calculated.
Usage
raftery.diag(data, q=0.025, r=0.005, s=0.95, converge.eps=0.001)Arguments
data | an |
q | the quantile to be estimated. |
r | the desired margin of error of the estimate. |
s | the probability of obtaining an estimate in the interval (q-r,q+r). |
converge.eps | Precision required for estimate of time to convergence. |
Value
A list with classraftery.diag. A print method is availablefor objects of this class. the contents of the list are
tspar | The time series parameters of |
params | A vector containing the parameters |
Niters | The number of iterations in |
resmatrix | A 3-d array containing the results: |
Theory
The estimated sample size for variable U is based on the processZ_t = d(U_t <= u) whered is the indicator function and u is theqth quantile of U. The processZ_t is derived from the Markovchaindata by marginalization and truncation, but is not itselfa Markov chain. However,Z_t may behave as a Markov chain ifit is sufficiently thinned out.raftery.diag calculates thesmallest value of thinning intervalk which makes the thinnedchainZ^k_t behave as a Markov chain. The required sample size iscalculated from this thinned sequence. Since some data is ‘thrown away’the sample size estimates are conservative.
The criterion for the number of ‘burn in’ iterationsm to bediscarded, is that the conditional distribution ofZ^k_mgivenZ_0 should be withinconverge.eps of the equilibriumdistribution of the chainZ^k_t.
Note
raftery.diag is based on the FORTRAN program ‘gibbsit’ written by Steven Lewis, and available from the Statlib archive.
References
Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics:Implementation strategies for Markov chain Monte Carlo.Statistical Science,7, 493-497.
Raftery, A.E. and Lewis, S.M. (1995). The number of iterations,convergence diagnostics and generic Metropolis algorithms.InPractical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalterand S. Richardson, eds.). London, U.K.: Chapman and Hall.
Read data interactively and check that it satisfies conditions
Description
Input is read interactively and checked against conditions specifiedby the argumentswhat,lower,upper andanswer.in. If the input does not satisfy all the conditions,an appropriate error message is produced and the user is promptedto provide input. This process is repeated until a valid input valueis entered.
Usage
read.and.check(message = "", what = numeric(), lower, upper, answer.in,default)Arguments
message | message displayed before prompting for user input. |
what | the type of |
lower | lower limit of input, for numeric input only. |
upper | upper limit of input, for numeric input only. |
answer.in | the input must correspond to one of the elements of thevector |
default | value assumed if user enters a blank line. |
Value
The value of the valid input. When thedefault argument isspecified, a blank line is accepted as valid input and in this caseread.and.check returns the value ofdefault.
Note
Since the function does not return a value until it receives validinput, it extensively checks the conditions for consistency beforeprompting the user for input. Inconsistent conditions will causean error.
Author(s)
Martyn Plummer
Read output files in CODA format
Description
read.coda reads Markov Chain Monte Carlo output inthe CODA format produced by OpenBUGS and JAGS. By default, allof the data in the file is read, but the argumentsstart,end andthin may be used to read a subset of thedata. If the arguments given tostart,end orthin are incompatible with the data, they are ignored.
Usage
read.coda(output.file, index.file, start, end, thin, quiet=FALSE)read.jags(file = "jags.out", start, end, thin, quiet=FALSE)Arguments
output.file | The name of the file containing the monitoredoutput |
index.file | The name of the file containing the index, showingwhich rows of the output file correspond to which variables |
file | For JAGS output, the name of the output file. Theextension ".out" may be omitted. There must be a corresponding".ind" file with the same file stem. |
start | First iteration of chain |
end | Last iteration of chain |
thin | Thinning interval for chain |
quiet | Logical flag. If true, a progress summary will be printed |
Value
An object of classmcmc containing a representation of the data in the file.
Author(s)
Karen Vines, Martyn Plummer
References
Spiegelhalter DJ, Thomas A, Best NG and Gilks WR (1995).BUGS: Bayesian inference Using Gibbs Sampling, Version 0.50.MRC Biostatistics Unit, Cambridge.
See Also
mcmc,read.coda.interactive,read.openbugs.
Read CODA output files interactively
Description
read.coda.interactive reads Markov Chain Monte Carlo outputin the format produced by the classic BUGS program. No arguments arerequired. Instead, the user is prompted for the required information.
Usage
read.coda.interactive()Value
An object of classmcmc.list containing a representation of the data in one or more BUGS output files.
Note
This function is normally called by thecodamenu function,but can also be used on a stand-alone basis.
Author(s)
Nicky Best, Martyn Plummer
References
Spiegelhalter DJ, Thomas A, Best NG and Gilks WR (1995).BUGS: Bayesian inference Using Gibbs Sampling, Version 0.50.MRC Biostatistics Unit, Cambridge.
See Also
mcmc,mcmc.list,read.coda,codamenu.
Read CODA output files produced by OpenBUGS
Description
read.openbugs reads Markov Chain Monte Carlo output inthe CODA format produced by OpenBUGS.
This is a convenience wrapper around the functionread.codawhich allows you to read all the data output by OpenBUGS byspecifying only the file stem.
Usage
read.openbugs(stem="", start, end, thin, quiet=FALSE)Arguments
stem | Character string giving the stem for the output files.OpenBUGS produces files with names "<stem>CODAindex.txt","<stem>CODAchain1.txt", "<stem>CODAchain2.txt", ... |
start | First iteration of chain |
end | Last iteration of chain |
thin | Thinning interval for chain |
quiet | Logical flag. If true, a progress summary will be printed |
Value
An object of classmcmc.list containing output from allchains.
Author(s)
Martyn Plummer
References
Spiegelhalter DJ, Thomas A, Best NG and Lunn D (2004).WinBUGS User Manual, Version 2.0, June 2004,MRC Biostatistics Unit, Cambridge.
See Also
Rejection Rate for Metropolis–Hastings chains
Description
rejectionRate calculates the fraction of time that aMetropolis–Hastings type chain rejected a proposed move. The rejectionrate is calculates separately for each variable in themcmc.objargument, irregardless of whether the variables were drawn separately orin a block. In the latter case, the values returned should be thesame.
Usage
rejectionRate(x)Arguments
x | An |
Details
For the purposes of this function, a "rejection" has occurred if thevalue of the time series is the same at two successive time points.This test is done naively using== and may produce problems dueto rounding error.
Value
A vector containing the rejection rates, one for each variable.
Author(s)
Russell Almond
Estimate spectral density at zero
Description
The spectral density at frequency zero is estimated by fitting a glm tothe low-frequency end of the periodogram.spectrum0(x)/length(x)estimates the variance ofmean(x).
Usage
spectrum0(x, max.freq = 0.5, order = 1, max.length = 200)Arguments
x | A time series. |
max.freq | The glm is fitted on the frequency range (0, max.freq] |
order | Order of the polynomial to fit to the periodogram. |
max.length | The data |
Details
The raw periodogram is calculated for the seriesx and a generalizedlinear model with familyGamma and log link is fitted tothe periodogram.
The linear predictor is a polynomial in terms of the frequency. Thedegree of the polynomial is determined by the parameterorder.
Value
A list with the following values
spec | The predicted value of the spectral density at frequency zero. |
Theory
Heidelberger and Welch (1991) observed that the usual non-parametricestimator of the spectral density, obtained by smoothing the periodogram,is not appropriate for frequency zero. They proposed an alternativeparametric method which consisted of fitting a linear model to thelog periodogram of the batched time series. Some technical problems with model fitting in their original proposal can be overcome by usinga generalized linear model.
Batching of the data, originally proposed in order to save space, has theside effect of flattening the spectral density and making a polynomialfit more reasonable. Fitting a polynomial of degree zero is equivalentto using the ‘batched means’ method.
Note
The definition of the spectral density used here differs from that used byspec.pgram. We consider the frequency range to be between 0 and 0.5,not between 0 andfrequency(x)/2.
The model fitting may fail on chains with very high autocorrelation.
References
Heidelberger, P and Welch, P.D. A spectral method for confidence intervalgeneration and run length control in simulations. Communications of theACM, Vol 24, pp233-245, 1981.
See Also
Estimate spectral density at zero
Description
The spectral density at frequency zero is estimated by fitting anautoregressive model.spectrum0(x)/length(x) estimates thevariance ofmean(x).
Usage
spectrum0.ar(x)Arguments
x | A time series. |
Details
Thear() function to fit an autoregressive model to the timeseries x. For multivariate time series, separate models are fitted foreach column. The value of the spectral density at zero is then givenby a well-known formula.
Value
A list with the following values
spec | The predicted value of the spectral density at frequency zero. |
order | The order of the fitted model |
Note
The definition of the spectral density used here differs from that used byspec.pgram. We consider the frequency range to be between 0 and 0.5,not between 0 andfrequency(x)/2.
See Also
Summary statistics for Markov Chain Monte Carlo chains
Description
summary.mcmc produces two sets of summary statistics foreach variable:
Mean, standard deviation, naive standard error of the mean(ignoring autocorrelation of the chain) and time-series standarderror based on an estimate of the spectral density at 0.
Quantiles of the sample distribution using thequantilesargument.
Usage
## S3 method for class 'mcmc'summary(object, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975), ...)Arguments
object | an object of class |
quantiles | a vector of quantiles to evaluate for each variable |
... | a list of further arguments |
Author(s)
Martyn Plummer
See Also
Thinning interval
Description
thin returns the interval between successivevalues of a time series.thin(x) is equivalentto1/frequency(x).
This is a generic function. Methods have been implementedformcmc objects.
Usage
thin(x, ...)Arguments
x | a regular time series |
... | a list of arguments |
Author(s)
Martyn Plummer
See Also
time.
Time attributes for mcmc objects
Description
These are methods for mcmc objects for the generic timeseries functions.
Usage
## S3 method for class 'mcmc'time(x, ...)## S3 method for class 'mcmc'start(x, ...)## S3 method for class 'mcmc'end(x, ...)## S3 method for class 'mcmc'thin(x, ...)Arguments
x | an |
... | extra arguments for future methods |
See Also
Trace plot of MCMC output
Description
Displays a plot of iterationsvs. sampled values for each variablein the chain, with a separate plot per variable.
Usage
traceplot(x, smooth = FALSE, col = 1:6, type = "l", xlab = "Iterations", ylab = "", ...)Arguments
x | An |
smooth | draw smooth line through trace plot |
col | graphical parameter (see |
type | graphical parameter (see |
xlab | graphical parameter (see |
ylab | graphical parameter (see |
... | further graphical parameters |
Note
You can call this function directly, but it is more usuallycalled by theplot.mcmc function.
See Also
Trellis plots for mcmc objects
Description
These methods use the Trellis framework as implemented in thelattice package to produce space-conserving diagnostic plotsfrom"mcmc" and"mcmc.list" objects. Thexyplotmethods produce trace plots. Thedensityplot methods andqqmath methods produce empirical density and probabilityplots. Thelevelplot method depicts the correlation of theseries. Theacfplot methods plot the auto-correlation in theseries.
Not yet available in S-PLUS.
Usage
## S3 method for class 'mcmc'densityplot(x, data, outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset)## S3 method for class 'mcmc.list'densityplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(relation = "free"), start = 1, thin = 1, main = attr(x, "title"), xlab = "", plot.points = "rug", ..., subset)## S3 method for class 'mcmc'levelplot(x, data, main = attr(x, "title"), start = 1, thin = 1, ..., xlab = "", ylab = "", cuts = 10, at, col.regions = topo.colors(100), subset)## S3 method for class 'mcmc'qqmath(x, data, outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset)## S3 method for class 'mcmc.list'qqmath(x, data, outer = FALSE, groups = !outer, aspect = "xy", default.scales = list(y = list(relation = "free")), prepanel = prepanel.qqmathline, start = 1, thin = 1, main = attr(x, "title"), ylab = "", ..., subset)## S3 method for class 'mcmc'xyplot(x, data, outer, layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset)## S3 method for class 'mcmc.list'xyplot(x, data, outer = FALSE, groups = !outer, aspect = "xy", layout = c(1, nvar(x)), default.scales = list(y = list(relation = "free")), type = 'l', start = 1, thin = 1, xlab = "Iteration number", ylab = "", main = attr(x, "title"), ..., subset)acfplot(x, data, ...)## S3 method for class 'mcmc'acfplot(x, data, outer, prepanel, panel, type = 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset)## S3 method for class 'mcmc.list'acfplot(x, data, outer = FALSE, groups = !outer, prepanel, panel, type = if (groups) 'b' else 'h', aspect = "xy", start = 1, thin = 1, lag.max = NULL, ylab = "Autocorrelation", xlab = "Lag", main = attr(x, "title"), ..., subset)Arguments
x | an |
data | ignored, present for consistency with generic. |
outer | for the |
groups | for the |
aspect | controls the physical aspect ratio of the panel. See |
default.scales | this parameter provides a reasonable defaultvalue of the |
type | a character vector that determines if lines, points,etc. are drawn on the panel. The default values for the methods arecarefully chosen. See |
thin | an optional thinning interval that is applied before theplot is drawn. |
start | an optional value for the starting point within theseries. Values before the starting point are considered part of the"burn-in" of the series and dropped. |
plot.points | character argument giving the style in whichpoints are added to the plot. See |
layout | a method-specific default for the |
xlab,ylab,main | Used to provide default axis annotations andplot labels. |
cuts,at | defines number and location of values where colorschange |
col.regions | color palette used |
lag.max | maximum lag for which autocorrelation is computed. Bydefault, the value chosen by |
prepanel,panel | suitable prepanel and panel functions for |
... | other arguments, passed to the lattice function.Documentation of the corresponding generics in the |
subset | indices of the subset of the series to plot. Thedefault is constructed from the |
Value
An object of class"trellis". The relevantupdate method can be used toupdate components of the object and theprint method (usually called bydefault) will plot it on an appropriate plotting device.
Author(s)
Deepayan SarkarDeepayan.Sarkar@R-project.org
See Also
Lattice for a brief introduction tolattice displays and links to further documentation.
Examples
data(line)## Not run: xyplot(line)xyplot(line[[1]], start = 10)densityplot(line, start = 10)qqmath(line, start = 10)levelplot(line[[2]])acfplot(line, outer = TRUE)## End(Not run)Named dimensions of MCMC objects
Description
varnames() returns the variable names andchanamesreturns the chain names, or NULL if these are not set.
Ifallow.null = FALSE thenNULL values will bereplaced with canonical names.
Usage
varnames(x, allow.null=TRUE)chanames(x, allow.null=TRUE)varnames(x) <- valuechanames(x) <- valueArguments
x | an |
allow.null | Logical argument that determines whether the function may return NULL |
value | A character vector, or NULL |
Value
A character vector , or NULL.
See Also
Time windows for mcmc objects
Description
window.mcmc is a method formcmc objects which isnormally called by the generic functionwindow
In addition to the generic parameters,start andendthe additional parameterthin may be used to thin out theMarkov chain. Setting thin=k selects every kth iteration startingwith the first. Note that the value ofthin isabsolutenot relative. The value supplied given to the parameterthinmust be a multiple ofthin(x).
Values ofstart,end andthin which are inconsistentwithx are ignored, but a warning message is issued.
Usage
## S3 method for class 'mcmc'window(x, start, end, thin, ...)Arguments
x | an mcmc object |
start | the first iteration of interest |
end | the last iteration of interest |
thin | the required interval between successive samples |
... | futher arguments for future methods |