Bayesian infinite factor modelling
This package was developed for modular construction of Bayesian factor samplers using a variety of data models and priors. Bayesian factor models are key tools for performing reproducible and robust dimension reduction and for modeling complex relationships among sets of regressors. Accounting for structure with factor models allows both efficiencies in estimation and inference on that structure. Popular shrinkage priors (MGSP, Dirichlet-Laplace, and others) have been formulated for the factor loadings matrix that do not require prespecified covariate grouping or order-dependence. This package aims to implement those priors for use with multiple data models in an easy, straightforward, and fast environment for application, as well as to encourage further model and prior development through modular and exchangeable parameter updates.
The augment sampler input argument can be used to modify sampler behavior for non-standard applications. This argument takes an R expression to be evaluated every sampler iteration, after the standard parameter updates and before samples are stored. A straightforward application of this argument is to change a sampling parameter from a single fixed value to a random variable. This can be done to place further hyperpriors on shrinkage parameters, or to perform data augmentation. Refer to the sampler source code for parameter names.
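The evaluation mechanism can be sketched with a toy loop in base R. Here toySampler and its single parameter theta are hypothetical stand-ins for illustration, not the package's actual sampler internals:

```r
# Toy illustration of how an "augment" expression is spliced into a sampler
# loop: it is evaluated once per iteration, after the standard parameter
# updates and before the sample is stored, in the sampler's own environment.
# Hypothetical sketch, not the package's actual code.
toySampler = function(niter, augment = NULL){
  theta = 0
  out = numeric(niter)
  for(i in 1:niter){
    theta = theta + 1                    # stand-in for the standard updates
    if(!is.null(augment)) eval(augment)  # user-supplied augmentation step
    out[i] = theta                       # samples stored after augmentation
  }
  out
}

# augmentation that modifies theta each iteration, as a further "update"
aug = expression({theta = theta / 2})
res = toySampler(5, aug)
```

Because eval defaults to the calling frame, the expression can read and overwrite the sampler's working variables directly, which is what makes hyperprior updates and data augmentation possible without modifying the sampler itself.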
A common form of data augmentation is imputation due to missingness. In the included factor models we have simple representations of the posterior predictive distributions of the factorized data matrix X and the interaction response y. Here we provide example code to sample missing entries of X under a Missing at Random (MAR) assumption.
```r
# for data matrix "data" with NA missing values
completeX = function(X, Xmiss, lambda, eta, ps){
  # idiosyncratic noise: one row per observation, column variances 1/ps
  noise = t(replicate(nrow(X), rnorm(ncol(X), 0, sqrt(1/ps))))
  # replace missing entries with posterior predictive draws
  X[is.na(Xmiss)] = (tcrossprod(eta, lambda) + noise)[is.na(Xmiss)]
  return(X)
}
missing = expression({X = completeX(X, data, lambda, eta, ps)})
# initialize missing entries with standard normal draws
X = data
X[is.na(X)] = rnorm(sum(is.na(X)))
sample = linearMGSP(X, 10000, 5000, augment = missing)
```

Similarly, we can define an augmentation for missingness due to a (lower) limit of detection. Here we allow a different LoD for every element of X, corresponding to test and batch LoD variability, with those limits encoded in the matrix LOD, where dim(LOD) = dim(X). In this case the appropriate posterior predictive is the same as above, but truncated at the limit of detection. We use the truncnorm package for truncated normal sampling.
```r
# for data matrix "data" with NA values censored at the elementwise
# lower limits of detection stored in the matrix "LOD"
library(truncnorm)
lodX = function(X, Xmiss, lambda, eta, ps){
  # posterior predictive variances and means at the missing entries
  vars = matrix(1/ps, nrow(X), ncol(X), byrow = TRUE)[is.na(Xmiss)]
  nmiss = sum(is.na(Xmiss))
  means = tcrossprod(eta, lambda)[is.na(Xmiss)]
  # draw from normals truncated above at the limits of detection
  X[is.na(Xmiss)] = rtruncnorm(nmiss, b = LOD[is.na(Xmiss)], mean = means, sd = sqrt(vars))
  return(X)
}
missing = expression({X = lodX(X, data, lambda, eta, ps)})
# initialize missing entries at their limits of detection
X = data
X[is.na(X)] = LOD[is.na(X)]
sample = linearMGSP(X, 10000, 5000, augment = missing)
```

This code can easily be altered for upper limits of detection (see the rtruncnorm documentation).