See the NEWS file for recent updates, and below for quickstart!
ctsem allows for easy specification and fitting of a range of continuous and discrete time dynamic models, including multiple indicators (dynamic factor analysis), multiple, potentially higher order processes, and time dependent (varying within subject) and time independent (not varying within subject) covariates. Classic longitudinal models like latent growth curves and latent change score models are also possible. Version 1 of ctsem provided SEM based functionality by linking to the OpenMx software, allowing mixed effects models (random means but fixed regression and variance parameters) for multiple subjects. For version 2 of the R package ctsem, we include a hierarchical specification and fitting routine that uses the Stan probabilistic programming language, via the rstan package in R. This allows for all parameters of the dynamic model to individually vary, using an estimated population mean and variance, and any time independent covariate effects, as a prior. Version 3 allows for state dependencies in the parameter specification (i.e. time varying parameters).
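To make the continuous versus discrete time distinction concrete, here is a minimal base R sketch (not ctsem code). It assumes a univariate first-order process and a hypothetical drift value of -0.5 chosen purely for illustration. Under that assumption, the discrete time autoregression implied for an interval of length dt is exp(drift * dt), so a single continuous time parameter covers any spacing between observations.

```r
# Hypothetical continuous-time drift (self-feedback) value, for illustration only.
drift <- -0.5

# Time intervals between successive observations (arbitrary units).
dt <- c(0.5, 1, 2, 4)

# For a univariate first-order process, the autoregression implied over an
# interval dt is exp(drift * dt).
ar <- exp(drift * dt)

print(data.frame(dt = dt, autoregression = round(ar, 3)))
```

With a negative drift the implied coefficient shrinks as the interval grows; with a positive drift it would grow instead.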
The current manual is at https://cran.r-project.org/package=ctsem/vignettes/hierarchicalmanual.pdf. The original ctsem is documented in a JSS publication (Driver, Voelkle, Oud, 2017), and in R vignette form at https://cran.r-project.org/package=ctsemOMX/vignettes/ctsem.pdf; however, these OpenMx based functions have been split off into a sub package, ctsemOMX. For most use cases the newer formulation (with Kalman filtering coded in Stan) is faster, more robust, and more flexible, and both default to maximum likelihood. For cases with many subjects, few time points, and no individual differences in timing, ctsemOMX may be faster.
For questions (or to see past answers) please use https://github.com/cdriveraus/ctsem/discussions
For some tutorials and another quick start, see . The very quick start is below.
To cite ctsem please use the citation("ctsem") command in R.
Current development version (GitHub):

remotes::install_github('cdriveraus/ctsem', INSTALL_opts = "--no-multiarch", dependencies = c("Depends", "Imports"))

Released version (CRAN):

install.packages('ctsem')

Ensure that recent versions of R and Rtools are installed. If the installctsem.R code has never been run before, be sure to run that (see above).
Place this line in ~/.R/makevars.win, and if there are other lines, delete them:
CXX17FLAGS += -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations

For compile issues, check if you can use rstan, check forum posts on
In case of compile errors like g++ not found, ensure the devtools package is installed:
install.packages('devtools')

#' The basic long data structure. Diet (our covariate) is a categorical variable, so it needs dummy / 'one hot' encoding.
head(ChickWeight)

#' Set up dummy coding
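For readers who prefer to avoid extra packages, dummy coding can also be sketched in base R with model.matrix(). This is an illustrative alternative rather than part of the original quick start: the object names chick, dummies and chick_dummies are hypothetical, and the columns are renamed to the Diet_1 to Diet_4 pattern only so that they line up with the covariate names used in the model specification.

```r
# Base R sketch of one-hot encoding for the Diet factor (no extra packages).
chick <- as.data.frame(ChickWeight)

# "- 1" drops the intercept so that every factor level gets its own 0/1 column.
dummies <- model.matrix(~ Diet - 1, data = chick)
colnames(dummies) <- paste0("Diet_", levels(chick$Diet))

# Hypothetical object name: the remaining columns combined with the dummy columns.
chick_dummies <- cbind(chick[, c("weight", "Time", "Chick")], dummies)
head(chick_dummies)
```

Each row has exactly one dummy column equal to 1. When the model is specified, one diet category is left out to serve as the baseline.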
library(data.table)
library(mltools)
chickdata <- one_hot(as.data.table(ChickWeight), cols = 'Diet')

#' Scaling of continuous variables makes for easier estimation and more sensible default priors (if used). Time intervals can also benefit.
chickdata$weight <- scale(chickdata$weight)
head(chickdata) # now we have the four diet categories

#' Set up continuous time model: in this case we are estimating a regular first order autoregressive model
library(ctsem)
m <- ctModel(
  LAMBDA = diag(1), # Factor loading matrix of latent processes on measurements, fixed to 1
  type = 'ct', # Could specify 'dt' here for discrete time.
  tipredDefault = FALSE, # limit covariate effects on parameters to those explicitly specified
  manifestNames = 'weight', # Observed measurements of the latent processes
  latentNames = 'Lweight', # Names here simply make parameters and plots more interpretable
  TIpredNames = paste0('Diet_', 2:4), # Covariates, in this case one category needs to be baseline...
  DRIFT = 'a11 | param', # normally self feedback (diagonal drift terms) are restricted to negative
  MANIFESTMEANS = 0, # For identification CINT is normally zero with this freely estimated
  CINT = 'cint ||||Diet_2,Diet_3,Diet_4', # diet covariates specified in 5th 'slot' (four '|' separators)
  time = 'Time',
  id = 'Chick')

#' View model in pdf / latex form
ctModelLatex(m)

#' Fit model to data, here using priors because Hessian problems are reported otherwise
f <- ctStanFit(chickdata, m, priors = TRUE)

#' Summarise fit and view covariate effects. Diets 3 and 4 seem most obviously successful
s <- summary(f)
print(s$tipreds)

#' Predictions conditional on all earlier data
ctKalman(f, plot = TRUE, subjects = 2:4, kalmanvec = c('yprior', 'ysmooth'))

#' Predictions conditional only on covariates, showing 1 chick from each diet
ctKalman(f, plot = TRUE,
  subjects = as.numeric(chickdata$Chick[!duplicated(ChickWeight$Diet)]),
  removeObs = TRUE, polygonalpha = 0)

#' Plot temporal regression coefficients conditional on time interval (increases in this case!)
ctStanDiscretePars(f, plot = TRUE)

#' Other useful functions:
#' Compare two fits: ctChisqTest()
#' Fit and summarise / plot a list of models: ctFitMultiModel()
#' Add samples to fit to increase estimate precision: ctAddSamples()
#' Return dynamic system parameters in matrix forms: ctStanContinuousPars()
#' Compute cross validation statistics: ctLOO()
#' Plot time independent predictor (covariate effects on parameters): ctStanTIpredEffects()
#' Generate data from a specified model of fixed parameters: ctGenerate()
#' Generate data from a specified model of fixed and free parameters / priors: ctStanGenerate()
#' Generate data from a fitted model: ctStanGenerateFromFit()
#' Get samples from the fitted object: ctExtract()
#' In samples, pop_DRIFT refers to the population drift matrix, subj_DRIFT refers to the subject matrix. Subject matrices are only computed for max likelihood / posterior mode by default, and found in the