| Type: | Package |
| Title: | Bayesian Consensus Clustering for Multiple Longitudinal Features |
| Version: | 1.0.3 |
| Maintainer: | Zhiwen Tan <21zt9@queensu.ca> |
| Description: | It is very common nowadays for a study to collect multiple features and appropriately integrating multiple longitudinal features simultaneously for defining individual clusters becomes increasingly crucial to understanding population heterogeneity and predicting future outcomes. 'BCClong' implements a Bayesian consensus clustering (BCC) model for multiple longitudinal features via a generalized linear mixed model. Compared to existing packages, several key features make the 'BCClong' package appealing: (a) it allows simultaneous clustering of mixed-type (e.g., continuous, discrete and categorical) longitudinal features, (b) it allows each longitudinal feature to be collected from different sources with measurements taken at distinct sets of time points (known as irregularly sampled longitudinal data), (c) it relaxes the assumption that all features have the same clustering structure by estimating the feature-specific (local) clusterings and consensus (global) clustering. |
| License: | MIT + file LICENSE |
| Depends: | R (≥ 3.5.0) |
| Imports: | cluster, coda, ggplot2, graphics, label.switching, LaplacesDemon, lme4, MASS, mclust, MCMCpack, mixAK, mvtnorm, nnet, Rcpp (≥ 1.0.9), Rmpfr, stats, truncdist, abind, gridExtra |
| Suggests: | cowplot, joineRML, knitr, rmarkdown, survival, survminer, testthat (≥ 3.0.0) |
| LinkingTo: | Rcpp, RcppArmadillo |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.1 |
| NeedsCompilation: | yes |
| Packaged: | 2024-06-23 23:47:10 UTC; whytz |
| Author: | Zhiwen Tan [aut, cre], Zihang Lu [ctb], Chang Shen [ctb] |
| Repository: | CRAN |
| Date/Publication: | 2024-06-24 00:00:02 UTC |
Compute a Bayesian Consensus Clustering model for mixed-type longitudinal data
Description
This function performs clustering on mixed-type (continuous, discrete and categorical) longitudinal markers using the Bayesian consensus clustering method with MCMC sampling.
Usage
BCC.multi(
  mydat,
  id,
  time,
  center = 1,
  num.cluster,
  formula,
  dist,
  alpha.common = 0,
  initials = NULL,
  sigma.sq.e.common = 1,
  hyper.par = list(delta = 1, a.star = 1, b.star = 1, aa0 = 0.001, bb0 = 0.001,
    cc0 = 0.001, ww0 = 0, vv0 = 1000, dd0 = 0.001, rr0 = 4, RR0 = 3),
  c.ga.tunning = NULL,
  c.theta.tunning = NULL,
  adaptive.tunning = 0,
  tunning.freq = 20,
  initial.cluster.membership = "random",
  input.initial.local.cluster.membership = NULL,
  input.initial.global.cluster.membership = NULL,
  seed.initial = 2080,
  burn.in,
  thin,
  per,
  max.iter
)
Arguments
| mydat | list of R longitudinal features (i.e., with a length of R), where R is the number of features. The data should be prepared in a long format (each row is one time point per individual). |
| id | a list (with a length of R) of vectors of the study id of individuals for each feature. A single value (i.e., a length of 1) is recycled if necessary. |
| time | a list (with a length of R) of vectors of time (or age) at which the feature measurements are recorded |
center | 1: center the time variable before clustering, 0: no centering |
num.cluster | number of clusters K |
| formula | a list (with a length of R) of formulas, one for each feature. Each formula is a two-sided linear formula object describing both the fixed-effects and random-effects parts of the model, with the response (i.e., longitudinal feature) on the left of a ~ operator and the terms, separated by + operators, on the right. Random-effects terms are distinguished by vertical bars (|) separating expressions for design matrices from grouping factors. See the formula argument from the lme4 package. |
| dist | a character vector (with a length of R) that determines the distribution for each feature. Possible values are "gaussian" for a continuous feature, "poisson" for a discrete feature (e.g., count data) using a log link and "binomial" for a dichotomous feature (0/1) using a logit link. A single value (i.e., a length of 1) is recycled if necessary. |
alpha.common | 1 - common alpha, 0 - separate alphas for each outcome |
| initials | List of initials for: zz, zz.local, ga, sigma.sq.u, sigma.sq.e. Default is NULL |
| sigma.sq.e.common | 1 - estimate common residual variance across all groups, 0 - estimate distinct residual variance, default is 1 |
| hyper.par | hyper-parameters of the prior distributions for the model parameters. The default hyper-parameter values will result in weakly informative prior distributions. |
| c.ga.tunning | tuning parameter for MH algorithm (fixed effect parameters), each parameter corresponds to an outcome/marker, default value equals NULL |
| c.theta.tunning | tuning parameter for MH algorithm (random effect), each parameter corresponds to an outcome/marker, default value equals NULL |
| adaptive.tunning | adaptive tuning parameters, 1 - yes, 0 - no, default is 0 |
tunning.freq | tuning frequency, default is 20 |
| initial.cluster.membership | "mixAK" or "random" or "PAM" or "input" - input initial cluster membership for local clustering, default is "random" |
| input.initial.local.cluster.membership | if initial.cluster.membership = "input" is used, the option input.initial.local.cluster.membership must not be empty, default is NULL |
| input.initial.global.cluster.membership | input initial cluster membership for global clustering, default is NULL |
| seed.initial | seed for initial clustering (for initial.cluster.membership = "mixAK"), default is 2080 |
| burn.in | the number of samples discarded. This value must be smaller than max.iter. |
| thin | the thinning interval. For example, if thin = 10, then the MCMC chain will keep one sample every 10 iterations |
per | specify how often the MCMC chain will print the iteration number |
max.iter | the number of MCMC iterations. |
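As a rough illustration of how burn.in, thin and max.iter interact, the sketch below lists the iterations that would be retained under a common convention: drop the first burn.in iterations, then keep every thin-th one. The helper name retained_iterations is hypothetical and not part of the package, and the exact indexing used inside BCC.multi is an assumption here, not something this manual confirms.

```r
# Hypothetical helper (not part of BCClong): lists which MCMC iterations
# would be kept under the usual "discard burn-in, then thin" convention.
# Assumption: BCC.multi may index its retained samples differently.
retained_iterations <- function(max.iter, burn.in, thin) {
  stopifnot(burn.in < max.iter, thin >= 1)
  seq(from = burn.in + 1, to = max.iter, by = thin)
}

# Settings from the BCC.multi example: 8 iterations, 3 discarded, no thinning
kept_small <- retained_iterations(max.iter = 8, burn.in = 3, thin = 1)
print(kept_small)

# A longer run: 12000 iterations, 2000 discarded, every 10th sample kept
kept_long <- retained_iterations(max.iter = 12000, burn.in = 2000, thin = 10)
print(length(kept_long))
```

Counting the retained draws this way before a long run can help check that enough samples remain for summaries such as the model selection criteria.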
Value
Returns a model object of class BCC that contains the clustering information
Examples
# import dataframe
data(epil)
# example only, larger number of iterations required for accurate result
fit.BCC <- BCC.multi(
  mydat = list(epil$anxiety_scale, epil$depress_scale),
  dist = c("gaussian"),
  id = list(epil$id),
  time = list(epil$time),
  formula = list(y ~ time + (1 | id)),
  num.cluster = 2,
  burn.in = 3, thin = 1, per = 1, max.iter = 8
)
Goodness of fit.
Description
This function assesses the model's goodness of fit by calculating the discrepancy measure T(y, Theta) with the following steps:
(a) Generate T.obs based on the MCMC samples
(b) Generate T.rep based on the posterior distribution of the parameters
(c) Compare T.obs and T.rep, and calculate the P values.
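Step (c) can be sketched in base R as follows. The vectors T.obs and T.rep below are simulated placeholders rather than output from BayesT, and the one-sided comparison mean(T.rep >= T.obs) is one common definition of a posterior predictive P value; whether BayesT uses exactly this definition is an assumption.

```r
# Simulated placeholder discrepancies, one pair per retained MCMC draw.
# These are NOT produced by BCClong; they only stand in for T.obs and T.rep.
set.seed(1)
n.draws <- 200
T.obs <- rchisq(n.draws, df = 10)  # discrepancy of the observed data per draw
T.rep <- rchisq(n.draws, df = 10)  # discrepancy of the replicated data per draw

# Posterior predictive P value: share of draws in which the replicated
# discrepancy is at least as large as the observed one.
p.value <- mean(T.rep >= T.obs)
print(p.value)
```

Under this definition, a value close to 0 or 1 would point to a systematic difference between observed and replicated data, while a value near 0.5 would suggest the two are hard to tell apart.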
Usage
BayesT(fit)
Arguments
| fit | an object output from the BCC.multi() function |
Value
Returns a data frame with two columns containing the observed and predicted values
Examples
#import data
data(example)
fit.BCC <- example
BayesT(fit.BCC)
PBCseqfit model
Description
This model contains the result of running the BCC.multi function on the PBC910 dataset from the mixAK package
Usage
data(PBCseqfit)
Format
This is a BCC model with thirty elements
Examples
data(PBCseqfit)
PBCseqfit
conRes dataset
Description
This data set contains the result of running the BayesT function on the epil1 BCC object. The epil1 object was obtained using the BCC.multi function
Usage
data(conRes)
Format
This is a dataframe with two columns and twenty observations
Examples
data(conRes)
conRes
epil dataset
Description
This is the epileptic.qol data set from the joineRML package
Usage
data(epil)
Format
This is a dataframe with 4 variables and 1852 observations
Examples
data(epil)
epil
epil1 model
Description
This model contains the result of running the BCC.multi function on the epileptic.qol dataset from the joineRML package. This model was fitted with formula = list(y ~ time + (1|id))
Usage
data(epil1)
Format
This is a BCC model with thirty elements
Examples
data(epil1)
epil1
epil2 model
Description
This model contains the result of running the BCC.multi function on the epileptic.qol dataset from the joineRML package. This model was fitted with formula = list(y ~ time + (1 + time|id))
Usage
data(epil2)
Format
This is a BCC model with thirty elements
Examples
data(epil2)
epil2
epil3 model
Description
This model contains the result of running the BCC.multi function on the epileptic.qol dataset from the joineRML package. This model was fitted with formula = list(y ~ time + time2 + (1 + time|id))
Usage
data(epil3)
Format
This is a BCC model with thirty elements
Examples
data(epil3)
epil3
example model
Description
This is an example model that contains the result of running the BCC.multi function on the epileptic.qol dataset from the joineRML package. It is only used in the documented examples and tests. Since a small number of iterations was used, this model may not represent the true performance of this method.
Usage
data(example)
Format
This is a BCC model with thirty elements
Examples
data(example)
example
example1 model
Description
This is an example model that contains the result of running the BCC.multi function on the epileptic.qol dataset from the joineRML package. It is only used in the tests. Since a small number of iterations was used, this model may not represent the true performance of this method.
Usage
data(example1)
Format
This is a BCC model with thirty elements
Examples
data(example1)
example1
Model selection
Description
A function that calculates DIC and WAIC for model selection
Usage
model.selection.criteria(fit, fast_version = TRUE)
Arguments
| fit | an object output from the BCC.multi() function |
| fast_version | if fast_version = TRUE (default), then compute the DIC and WAIC using the first 100 MCMC samples (after burn-in and thinning). If fast_version = FALSE, then compute the DIC and WAIC using all MCMC samples (after burn-in and thinning) |
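To make the two criteria and the effect of using only the first 100 samples concrete, here is a minimal sketch on a toy normal-mean model. The function criteria, the matrix loglik and the simulated draws mu.draws are all hypothetical and unrelated to the package internals. The formulas are the standard textbook definitions of DIC and WAIC; whether model.selection.criteria uses these exact variants is an assumption.

```r
# Toy setting (an assumption for illustration, unrelated to BCClong internals):
# y ~ Normal(mu, 1) with hypothetical posterior draws of mu.
set.seed(42)
y <- rnorm(30, mean = 2, sd = 1)
mu.draws <- rnorm(500, mean = mean(y), sd = 1 / sqrt(length(y)))

# Pointwise log-likelihood matrix: rows are MCMC draws, columns are observations
loglik <- sapply(y, function(yi) dnorm(yi, mean = mu.draws, sd = 1, log = TRUE))

criteria <- function(loglik, mu.draws, y) {
  # DIC = D(theta.bar) + 2 * pD, with pD = mean(D) - D(theta.bar)
  dev.draws <- -2 * rowSums(loglik)
  dev.at.mean <- -2 * sum(dnorm(y, mean = mean(mu.draws), sd = 1, log = TRUE))
  p.d <- mean(dev.draws) - dev.at.mean
  dic <- dev.at.mean + 2 * p.d
  # WAIC = -2 * (lppd - p.waic), p.waic = sum of pointwise log-lik variances
  lppd <- sum(log(colMeans(exp(loglik))))
  p.waic <- sum(apply(loglik, 2, var))
  c(DIC = dic, WAIC = -2 * (lppd - p.waic))
}

full <- criteria(loglik, mu.draws, y)                   # all retained draws
fast <- criteria(loglik[1:100, ], mu.draws[1:100], y)   # first 100 draws only
print(rbind(full, fast))
```

Comparing the two rows gives a feel for how much Monte Carlo error the shortcut introduces; for a final model comparison, the full set of samples is the safer choice.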
Value
Returns the calculated DIC and WAIC scores
Examples
#import data
data(example1)
fit.BCC <- example1
res <- model.selection.criteria(fit.BCC, fast_version = TRUE)
res
Generic plot method for BCC objects
Description
Generic plot method for BCC objects
Usage
## S3 method for class 'BCC'
plot(x, ...)
Arguments
x | An object of class BCC. |
... | further arguments passed to or from other methods. |
Value
No return value; called for its side effect of plotting the model object
Examples
# get data from the package
data(epil1)
fit.BCC <- epil1
plot(fit.BCC)
Generic print method for BCC objects
Description
Generic print method for BCC objects
Usage
## S3 method for class 'BCC'
print(x, ...)
Arguments
x | An object of class BCC. |
... | further arguments passed to or from other methods. |
Value
No return value; called for its side effect of printing the model information
Examples
# get data from the package
data(epil2)
fit.BCC <- epil2
print(fit.BCC)
Generic summary method for BCC objects
Description
Generic summary method for BCC objects
Usage
## S3 method for class 'BCC'
summary(object, ...)
Arguments
object | An object of class BCC. |
... | further arguments passed to or from other methods. |
Value
No return value; called for its side effect of summarizing the model information
Examples
# get data from the package
data(epil2)
fit.BCC <- epil2
summary(fit.BCC)
Trace plot function
Description
To visualize the MCMC chain for model parameters
Usage
traceplot(
  fit,
  cluster.indx = 1,
  feature.indx = 1,
  parameter = "PPI",
  xlab = NULL,
  ylab = NULL,
  ylim = NULL,
  xlim = NULL,
  title = NULL
)
Arguments
| fit | an object output from the BCC.multi() function. |
| cluster.indx | a numeric value. For cluster-specific parameters, specifying cluster.indx will generate the trace plot for the corresponding cluster. |
| feature.indx | a numeric value. For feature-specific parameters, specifying feature.indx will generate the trace plot for the corresponding feature. |
| parameter | a character value. Specify the parameter for which the trace plot will be generated. The value can be "PPI" for pi, alpha for alpha, "GA" for gamma, "SIGMA.SQ.U" for Sigma and "SIGMA.SQ.E" for sigma. |
xlab | Label for x axis |
ylab | Label for y axis |
ylim | The range for y axis |
xlim | The range for x axis |
title | Title for the trace plot |
Value
No return value; the function only displays plots
Examples
# get data from the package
data(epil1)
fit.BCC <- epil1
traceplot(fit = fit.BCC, parameter = "PPI", ylab = "pi", xlab = "MCMC samples")
Trajplot for fitted model
Description
Plots the longitudinal trajectory of features by local and global clusterings
Usage
trajplot(
  fit,
  feature.ind = 1,
  which.cluster = "global.cluster",
  title = NULL,
  ylab = NULL,
  xlab = NULL,
  color = NULL
)
Arguments
| fit | an object output from the BCC.multi() function |
| feature.ind | a numeric value indicating which feature to plot. The number indicates the order of the feature specified in the mydat argument of the BCC.multi() function |
| which.cluster | a character value: "global.cluster" or "local.cluster", indicating whether to plot the trajectory by global cluster or local cluster indices |
| title | Title for the trajectory plot |
ylab | Label for y axis |
xlab | Label for x axis |
color | Color for the trajplot |
Value
A plot object
Examples
# get data from the package
data(epil1)
fit.BCC <- epil1
# for local cluster
trajplot(fit = fit.BCC, feature.ind = 1, which.cluster = "local.cluster",
  title = "Local Clustering", xlab = "time (months)", ylab = "anxiety",
  color = c("#00BA38", "#619CFF"))
# for global cluster
trajplot(fit = fit.BCC, feature.ind = 1, which.cluster = "global.cluster",
  title = "Global Clustering", xlab = "time (months)", ylab = "anxiety",
  color = c("#00BA38", "#619CFF"))