
bcf 2.0.2

CRAN fixes

Noah Greifer updated the package source to reflect two changes to the CRAN checks that resulted in bcf being removed from CRAN in April 2023. Noah’s updates:

Removed sprintf() from the C++ source code, as it is now deprecated, and
Removed CXX_STD = CXX11 from src/Makevars and src/Makevars.win, as C++11 is now a CRAN default.

Serialization and performance updates

The prediction method introduced in the previous bcf version writes tree samples to text files, which can grow large if many samples are retained. Users concerned about the size of text file outputs may suppress writing to text files by specifying no_output = TRUE in the call to bcf().

Sampling employs within-chain parallelism through RcppParallel, but bcf does not, for the time being, run multiple chains in parallel through R’s high-level doParallel interface.

bcf 2.0.1

This implementation extends existing bcf functionality by:

allowing for heteroskedastic errors
automating multi-chain implementations
providing a suite of convergence diagnostic functions via the coda package
accelerating some underlying computations, resulting in shorter runtimes
providing a function to predict treatment effects based on an existing model using new data

Weights

The original version of bcf does not allow for weights, which are often used in practical applications to account for heteroskedasticity. Where the original BCF model was specified as:

y_i ∼ N(μ(x_i) + τ(x_i) z_i, σ²),

which assumes that all outcomes y_i have the same variance σ², in the extended version we can relax this assumption to allow for heteroskedasticity in y_i:

y_i ∼ N(μ(x_i) + τ(x_i) z_i, σ²/w_i)

Incorporating weights impacts several parts of the code, including the computation of:

sufficient statistics
leaf node means
leaf node means variance
error variance (sigma)
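
To make the role of the weights concrete, the following is a minimal sketch of how w_i enters a conjugate normal update for a single leaf mean and the residual sum of squares that feeds the error variance draw. It is illustrative only and is not bcf's C++ implementation: the function names, the zero-mean N(0, tau2) leaf prior, and the use of Python are all assumptions made for this example.

```python
def weighted_leaf_posterior(y, w, sigma2, tau2):
    """Posterior mean and variance of a leaf mean m, assuming
    y_i ~ N(m, sigma2 / w_i) and a prior m ~ N(0, tau2).

    Hypothetical helper for illustration; not part of bcf.
    """
    if len(y) != len(w):
        raise ValueError("y and w must have the same length")
    # Weighted sufficient statistics: sum of weights and weighted sum of outcomes.
    sum_w = sum(w)
    sum_wy = sum(wi * yi for wi, yi in zip(w, y))
    # Each observation contributes precision w_i / sigma2 instead of 1 / sigma2.
    precision = 1.0 / tau2 + sum_w / sigma2
    post_var = 1.0 / precision
    post_mean = post_var * (sum_wy / sigma2)
    return post_mean, post_var


def weighted_rss(y, w, fitted):
    """Weighted residual sum of squares, sum_i w_i * (y_i - f_i)^2,
    which replaces the unweighted sum when updating the error variance."""
    return sum(wi * (yi - fi) ** 2 for yi, wi, fi in zip(y, w, fitted))


if __name__ == "__main__":
    mean, var = weighted_leaf_posterior([1.0, 3.0], [1.0, 3.0], sigma2=1.0, tau2=1.0)
    print(mean, var)
```

With all weights equal to 1, both functions reduce to the usual homoskedastic quantities, which matches the original model specification.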

Automating multichain processing

In Bayesian analysis, it is useful to produce different runs of the same model – with different starting values – as a way of assessing convergence. If the different runs produce drastically different posterior distributions, it is a sign that the model has not converged fully. In this version of bcf we have automated multichain processing and incorporated key MCMC diagnostics from the coda package, including effective sample sizes and the Gelman-Rubin statistic (“R hat”).
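
As a rough illustration of what the Gelman-Rubin statistic measures, the sketch below computes the basic potential scale reduction factor for one scalar parameter from several equal-length chains. It is a simplified textbook form written in Python for illustration. The coda package's own implementation applies further corrections, so its values can differ slightly, and the function name here is hypothetical.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic Gelman-Rubin R-hat for a list of equal-length chains
    (each chain is a list of draws of one scalar parameter).

    Illustrative sketch only; not the coda implementation.
    """
    m = len(chains)
    if m < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    if n < 2 or any(len(c) != n for c in chains):
        raise ValueError("chains must share a common length of at least 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


if __name__ == "__main__":
    agreeing = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
    disagreeing = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]
    print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Chains that explore the same distribution give a value close to 1, while chains centered in different places inflate the between-chain term and push the value well above 1.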

Within-chain parallelism

Finally, our implementation conducts some steps of the sampling procedure in parallel to maximize computational efficiency. Our testing shows that these enhancements have reduced runtimes by around 50%, across various experimental conditions.

Implementing a prediction method

It is now possible to predict the treatment effect for a new set of units. Once users have produced a satisfactory bcf run (using training data), they can use this fitted bcf object to predict on a new set of test data. This is possible even with runs that have multiple chains.
