Specify autoregressive dynamic processes inmvgam

Source:R/mvgam_trend_types.R

RW.Rd

Set up autoregressive or autoregressive moving average trend models inmvgam. These functions do not evaluate their arguments – they existpurely to help set up a model with particular autoregressive trend models.

Usage

RW(ma=FALSE, cor=FALSE, gr=NA, subgr=NA)AR(p=1, ma=FALSE, cor=FALSE, gr=NA, subgr=NA)CAR(p=1)VAR(ma=FALSE, cor=FALSE, gr=NA, subgr=NA)

Arguments

ma

Logical. Include moving average terms of order1?Default isFALSE.

cor

Logical. Include correlated process errors as part of amultivariate normal process model? IfTRUE and ifn_series > 1 in the supplied data, a fully structured covariancematrix will be estimated for the process errors. Default isFALSE.

gr

An optional grouping variable, which must be afactor in thesupplieddata, for setting up hierarchical residual correlationstructures. If specified, this will automatically setcor = TRUE and setup a model where the residual correlations for a specific level ofgrare modelled hierarchically:

\(\Omega_{group} = \alpha_{cor}\Omega_{global} + (1 - \alpha_{cor})\Omega_{group, local}\),

where \(\Omega_{global}\) is aglobal correlation matrix,\(\Omega_{group, local}\) is alocal deviation correlation matrix and\(\alpha_{cor}\) is a weighting parameter controlling how strongly thelocal correlation matrix \(\Omega_{group}\) is shrunk towards the globalcorrelation matrix \(\Omega_{global}\) (larger values of\(\alpha_{cor}\) indicate a greater degree of shrinkage, i.e. a greaterdegree of partial pooling).

When used within aVAR() model, this essentially sets up a hierarchicalpanel vector autoregression where both the autoregressive and correlationmatrices are learned hierarchically. Ifgr is supplied thensubgrmust also be supplied.

subgr

A subgroupingfactor variable specifying which element indata represents the different time series. Defaults toseries, butnote that models that use the hierarchical correlations, where thesubgr time series are measured in each level ofgr,should notinclude aseries element indata. Rather, this element will be createdinternally based on the supplied variables forgr andsubgr.

For example, if you are modelling temporal counts for a group of species(labelled asspecies indata) across three different geographicalregions (labelled asregion), and you would like the residuals to becorrelated within regions, then you should specifygr = region andsubgr = species. Internally,mvgam() will create theseries elementfor the data using:

series = interaction(group, subgroup, drop = TRUE)

p

A non-negative integer specifying the autoregressive (AR) order.Default is1. Cannot currently be larger than3 forARterms, and cannot be anything other than1 for continuous time AR(CAR) terms.

Value

An object of classmvgam_trend, which contains a list ofarguments to be interpreted by the parsing functions inmvgam.

Details

Usevignette("mvgam_overview") to see the full details ofavailable stochastic trend types inmvgam, or view the renderedversion on the package website at:https://nicholasjclark.github.io/mvgam/articles/mvgam_overview.html

Author

Nicholas J Clark

Examples

# \donttest{# A short example to illustrate CAR(1) models# Function to simulate CAR1 data with seasonalitysim_corcar1=function(n=125,phi=0.5,sigma=2,sigma_obs=0.75){# Sample irregularly spaced time intervalstime_dis<-c(1,runif(n-1,0,5))# Set up the latent dynamic processx<-vector(length=n);x[1]<--0.3for(iin2:n){# zero-distances will cause problems in sampling, so mvgam uses a# minimum threshold; this simulation function emulates that processif(time_dis[i]==0){x[i]<-rnorm(1,        mean=(phi^1e-3)*x[i-1],        sd=sigma*(1-phi^(2*1e-3))/(1-phi^2))}else{x[i]<-rnorm(1,        mean=(phi^time_dis[i])*x[i-1],        sd=sigma*(1-phi^(2*time_dis[i]))/(1-phi^2))}}# Add 12-month seasonalitycov1<-sin(2*pi*(1:n)/12)cov2<-cos(2*pi*(1:n)/12)beta1<-runif(1,0.3,0.7)beta2<-runif(1,0.2,0.5)seasonality<-beta1*cov1+beta2*cov2# Take Gaussian observations with error and returndata.frame(    y=rnorm(n, mean=x+seasonality, sd=sigma_obs),    season=rep(1:12,20)[1:n],    time=cumsum(time_dis))}# Sample two time seriesdat<-rbind(dplyr::bind_cols(sim_corcar1(phi=0.65, sigma_obs=0.55),data.frame(series='series1')),dplyr::bind_cols(sim_corcar1(phi=0.8, sigma_obs=0.35),data.frame(series='series2')))%>%dplyr::mutate(series=as.factor(series))# mvgam with CAR(1) trends and series-level seasonal smoothsmod<-mvgam(  formula=y~-1,  trend_formula=~s(season, bs='cc', k=5, by=trend),  trend_model=CAR(),  priors=c(prior(exponential(3), class=sigma),prior(beta(4,4), class=sigma_obs)),  data=dat,  family=gaussian(),  chains=2,  silent=2)# View usual summaries and plotssummary(mod)#> GAM observation formula:#> y ~ 1#> <environment: 0x561bfe1ad8c0>#>#> GAM process formula:#> ~s(season, bs = "cc", k = 5, by = trend)#> <environment: 0x561bfe1ad8c0>#>#> Family:#> gaussian#>#> Link function:#> identity#>#> Trend model:#> CAR()#>#> N process models:#> 2#>#> N series:#> 2#>#> N timepoints:#> 125#>#> Status:#> Fitted using Stan#> 2 chains, each with iter = 1000; warmup = 500; thin = 1#> Total post-warmup draws = 1000#>#> Observation error parameter estimates:#>              2.5%  50% 97.5% Rhat n_eff#> sigma_obs[1] 0.11 0.48  0.83 1.03    18#> sigma_obs[2] 0.18 0.49  0.80 1.05    29#>#> GAM observation model coefficient (beta) estimates:#>             2.5% 50% 97.5% Rhat n_eff#> (Intercept)    0   0     0  NaN   NaN#>#> standard deviation:#>          2.5% 50% 97.5% Rhat n_eff#> sigma[1]  2.4 2.7   3.1    1   659#> sigma[2]  3.4 3.8   4.3    1  1526#>#> GAM process model coefficient (beta) estimates:#>                                2.5%   50% 97.5% Rhat n_eff#> (Intercept)_trend             -0.64  0.14  0.77    1  1245#> s(season):trendtrend1.1_trend -0.39  0.21  0.99    1   717#> s(season):trendtrend1.2_trend -1.30 -0.24  0.44    1   701#> s(season):trendtrend1.3_trend -1.30 -0.38  0.22    1   505#> s(season):trendtrend2.1_trend -0.78  0.02  0.85    1   971#> s(season):trendtrend2.2_trend -1.60 -0.19  0.63    1   716#> s(season):trendtrend2.3_trend -1.60 -0.19  0.44    1   503#>#> Approximate significance of GAM process smooths:#>                          edf Ref.df Chi.sq p-value#> s(season):seriestrend1 1.893      3  8.753   0.651#> s(season):seriestrend2 1.697      3  3.044   0.950#>#> Stan MCMC diagnostics:#> ✔ No issues with effective samples per iteration#> ✔ Rhat looks good for all parameters#> ✔ No issues with divergences#> ✔ No issues with maximum tree depth#>#> Samples were drawn using sampling(hmc). For each parameter, n_eff is a#>   crude measure of effective sample size, and Rhat is the potential scale#>   reduction factor on split MCMC chains (at convergence, Rhat = 1)#>#> Use how_to_cite() to get started describing this modelconditional_effects(mod, type='expected')plot(mod, type='trend', series=1)plot(mod, type='trend', series=2)plot(mod, type='residuals', series=1)plot(mod, type='residuals', series=2)mcmc_plot(mod,  variable='ar1',  regex=TRUE,  type='hist')#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.# Now an example illustrating hierarchical dynamicsset.seed(123)# Simulate three species monitored in three different regionssimdat1<-sim_mvgam(  trend_model=VAR(cor=TRUE),  prop_trend=0.95,  n_series=3,  mu=c(1,2,3))simdat2<-sim_mvgam(  trend_model=VAR(cor=TRUE),  prop_trend=0.95,  n_series=3,  mu=c(1,2,3))simdat3<-sim_mvgam(  trend_model=VAR(cor=TRUE),  prop_trend=0.95,  n_series=3,  mu=c(1,2,3))# Set up the data but DO NOT include 'series'all_dat<-rbind(simdat1$data_train%>%dplyr::mutate(region='qld'),simdat2$data_train%>%dplyr::mutate(region='nsw'),simdat3$data_train%>%dplyr::mutate(region='vic'))%>%dplyr::mutate(    species=gsub('series','species',series),    species=as.factor(species),    region=as.factor(region))%>%dplyr::arrange(series,time)%>%dplyr::select(-series)# Check priors for a hierarchical AR1 modelget_mvgam_priors(  formula=y~species,  trend_model=AR(gr=region, subgr=species),  data=all_dat)#>                                param_name param_length#> 1                             (Intercept)            1#> 2                        speciesspecies_2            1#> 3                        speciesspecies_3            1#> 4 vector<lower=-1,upper=1>[n_series] ar1;            9#> 5        vector<lower=0>[n_series] sigma;            9#>                      param_info                                  prior#> 1                   (Intercept)  (Intercept) ~ student_t(3, 1.9, 2.5);#> 2 speciesspecies_2 fixed effect speciesspecies_2 ~ student_t(3, 0, 2);#> 3 speciesspecies_3 fixed effect speciesspecies_3 ~ student_t(3, 0, 2);#> 4         trend AR1 coefficient                    ar1 ~ std_normal();#> 5                      trend sd       sigma ~ inv_gamma(1.418, 0.452);#>                     example_change new_lowerbound new_upperbound#> 1      (Intercept) ~ normal(0, 1);             NA             NA#> 2 speciesspecies_2 ~ normal(0, 1);             NA             NA#> 3 speciesspecies_3 ~ normal(0, 1);             NA             NA#> 4       ar1 ~ normal(-0.79, 0.86);             NA             NA#> 5       sigma ~ exponential(0.37);             NA             NA# Fit the modelmod<-mvgam(  formula=y~species,  trend_model=AR(gr=region, subgr=species),  data=all_dat,  chains=2,  silent=2)# Check standard outputssummary(mod)#> GAM formula:#> y ~ species#> <environment: 0x561bfe1ad8c0>#>#> Family:#> poisson#>#> Link function:#> log#>#> Trend model:#> AR(gr = region, subgr = species)#>#> N series:#> 9#>#> N timepoints:#> 75#>#> Status:#> Fitted using Stan#> 2 chains, each with iter = 1000; warmup = 500; thin = 1#> Total post-warmup draws = 1000#>#> GAM coefficient (beta) estimates:#>                  2.5%  50% 97.5% Rhat n_eff#> (Intercept)      0.92 1.10   1.3 1.03   174#> speciesspecies_2 0.80 0.98   1.2 1.01   409#> speciesspecies_3 1.60 1.90   2.1 1.05    70#>#> standard deviation:#>          2.5%  50% 97.5% Rhat n_eff#> sigma[1] 0.78 0.97  1.20 1.00   574#> sigma[2] 0.64 0.78  0.95 1.01   569#> sigma[3] 0.80 0.95  1.20 1.00   571#> sigma[4] 0.29 0.45  0.68 1.02   110#> sigma[5] 0.61 0.73  0.89 1.00   671#> sigma[6] 0.66 0.78  0.94 1.00   616#> sigma[7] 0.59 0.77  0.99 1.00   357#> sigma[8] 0.54 0.68  0.88 1.00   448#> sigma[9] 0.68 0.82  0.99 1.00   720#>#> precision parameter:#>        2.5% 50% 97.5% Rhat n_eff#> tau[1] 0.65 1.1   1.6 1.00   571#> tau[2] 1.10 1.6   2.4 1.00   574#> tau[3] 0.75 1.1   1.5 1.00   544#> tau[4] 2.20 4.9  12.0 1.02   143#> tau[5] 1.30 1.9   2.7 1.00   698#> tau[6] 1.10 1.6   2.3 1.00   641#> tau[7] 1.00 1.7   2.9 1.00   379#> tau[8] 1.30 2.2   3.4 1.00   489#> tau[9] 1.00 1.5   2.2 1.00   770#>#> autoregressive coef 1:#>           2.5%   50%  97.5% Rhat n_eff#> ar1[1]  0.2500  0.49 0.7100 1.02   129#> ar1[2]  0.1300  0.34 0.5500 1.01   160#> ar1[3]  0.0460  0.36 0.6500 1.04    45#> ar1[4]  0.3000  0.65 0.8700 1.01   158#> ar1[5] -0.1400  0.09 0.3000 1.00   153#> ar1[6] -0.4300 -0.21 0.0086 1.00    97#> ar1[7] -0.0015  0.31 0.6200 1.00   237#> ar1[8]  0.5400  0.74 0.9100 1.00   129#> ar1[9]  0.3200  0.54 0.7700 1.01   102#>#> Hierarchical correlation weighting parameter (alpha_cor) estimates:#>            2.5%   50% 97.5% Rhat n_eff#> alpha_cor 0.014 0.074  0.21    1   332#>#> Stan MCMC diagnostics:#> ✔ No issues with effective samples per iteration#> ✔ Rhat looks good for all parameters#> ✔ No issues with divergences#> ✔ No issues with maximum tree depth#>#> Samples were drawn using sampling(hmc). For each parameter, n_eff is a#>   crude measure of effective sample size, and Rhat is the potential scale#>   reduction factor on split MCMC chains (at convergence, Rhat = 1)#>#> Use how_to_cite() to get started describing this model# Inspect posterior estimates for the correlation weighting parametermcmc_plot(mod, variable='alpha_cor', type='hist')#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.# }