Movatterモバイル変換

Spatial GLMs with glmmfields

Sean C. Anderson and Eric J. Ward

2023-10-20

Here we will use the glmmfields package to fit a spatial GLM with apredictor. While glmmfields was designed to fit spatiotemporal GLMs withthe possibility of extreme events, it can also be used to fit regularspatial GLMs without a time element and without extreme events.Currently it can fit Gaussian (link = identity), Gamma (link = log),Poisson (link = log), negative binomial (link = log), and binomial (link= logit) models. The package can also fit lognormal (link = log)models.

Let’s load the necessary packages:

library(glmmfields)library(ggplot2)library(dplyr)

Set up parallel processing (not used in this example):

options(mc.cores = parallel::detectCores())

First, let’s simulate some data. We will use the built-in functionsim_glmmfields(), but normally you would start with yourown data. We will simulate 200 data points, some (fake) temperaturedata, an underlying random field spatial pattern, and add someobservation error. In this example we will fit a Gamma GLM with a loglink.

The underlying intercept is 0.5 and the slope between temperature andour observed variable (say biomass or density of individuals in aquadrat) is 0.2.

set.seed(1)N<-200# number of data pointstemperature<-rnorm(N,0,1)# simulated temperature dataX<-cbind(1, temperature)# our design matrixs<-sim_glmmfields(n_draws =1,gp_theta =1.5,n_data_points = N,gp_sigma =0.2,sd_obs =0.2,n_knots =12,obs_error ="gamma",covariance ="squared-exponential",X = X,B =c(0.5,0.2)# B represents our intercept and slope)d<- s$datd$temperature<- temperatureggplot(s$dat,aes(lon, lat,colour = y))+  viridis::scale_colour_viridis()+geom_point(size =3)

If we fit a regular GLM we can see that there is spatialautocorrelation in the residuals:

m_glm<-glm(y~ temperature,data = d,family =Gamma(link ="log"))m_glm

## ## Call:  glm(formula = y ~ temperature, family = Gamma(link = "log"), ##     data = d)## ## Coefficients:## (Intercept)  temperature  ##      0.5369       0.1967  ## ## Degrees of Freedom: 199 Total (i.e. Null);  198 Residual## Null Deviance:       23.27 ## Residual Deviance: 16.72     AIC: 280.9

confint(m_glm)

##                 2.5 %    97.5 %## (Intercept) 0.4964271 0.5780115## temperature 0.1522553 0.2413277

d$m_glm_residuals<-residuals(m_glm)ggplot(d,aes(lon, lat,colour = m_glm_residuals))+scale_color_gradient2()+geom_point(size =3)

Let’s instead fit a spatial GLM with random fields. Note that we areonly using 1 chain and 500 iterations here so the vignette buildsquickly on CRAN. For final inference, you should likely use 4 or morechains and 2000 or more iterations.

m_spatial<-glmmfields(y~ temperature,data = d,family =Gamma(link ="log"),lat ="lat",lon ="lon",nknots =12,iter =500,chains =1,prior_intercept =student_t(3,0,10),prior_beta =student_t(3,0,3),prior_sigma =half_t(3,0,3),prior_gp_theta =half_t(3,0,10),prior_gp_sigma =half_t(3,0,3),seed =123# passed to rstan::sampling())

## Warning: The largest R-hat is 1.05, indicating chains have not mixed.## Running the chains for more iterations may help. See## https://mc-stan.org/misc/warnings.html#r-hat

## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.## Running the chains for more iterations may help. See## https://mc-stan.org/misc/warnings.html#bulk-ess

## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.## Running the chains for more iterations may help. See## https://mc-stan.org/misc/warnings.html#tail-ess

Let’s look at the model output:

m_spatial

## Inference for Stan model: glmmfields.## 1 chains, each with iter=500; warmup=250; thin=1; ## post-warmup draws per chain=250, total post-warmup draws=250.## ##            mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat## gp_sigma   0.27    0.01 0.07   0.16   0.22   0.26   0.31   0.42   101 1.01## gp_theta   1.69    0.02 0.22   1.34   1.54   1.66   1.83   2.21   155 1.01## B[1]       0.48    0.01 0.07   0.35   0.44   0.48   0.52   0.61    79 1.00## B[2]       0.19    0.00 0.02   0.16   0.18   0.19   0.20   0.22   170 1.02## CV[1]      0.21    0.00 0.01   0.19   0.20   0.21   0.21   0.23   123 1.01## lp__     -62.21    0.23 2.74 -68.49 -63.99 -61.92 -60.12 -58.12   144 1.00## ## Samples were drawn using NUTS(diag_e) at Fri Oct 20 10:26:42 2023.## For each parameter, n_eff is a crude measure of effective sample size,## and Rhat is the potential scale reduction factor on split chains (at ## convergence, Rhat=1).

We can see that the 95% credible intervals are considerably wider onthe intercept term and narrower on the slope coefficient in the spatialGLM vs. the model that ignored the spatial autocorrelation.

Let’s look at the residuals in space this time:

plot(m_spatial,type ="spatial-residual",link =TRUE)+geom_point(size =3)

That looks better.

We can inspect the residuals versus fitted values:

plot(m_spatial,type ="residual-vs-fitted")

And the predictions from our model itself:

plot(m_spatial,type ="prediction",link =FALSE)+  viridis::scale_colour_viridis()+geom_point(size =3)

Compare that to our data at the top. Note that the original data alsoincludes observation error with a CV of 0.2.

We can also extract the predictions themselves with credibleintervals:

# link scale:p<-predict(m_spatial)head(p)

## # A tibble: 6 × 3##   estimate conf_low conf_high##      <dbl>    <dbl>     <dbl>## 1    0.723   0.635      0.807## 2    0.570   0.457      0.667## 3    0.182   0.0586     0.329## 4    0.927   0.792      1.05 ## 5    0.619   0.516      0.722## 6    0.495   0.388      0.614

# response scale:p<-predict(m_spatial,type ="response")head(p)

## # A tibble: 6 × 3##   estimate conf_low conf_high##      <dbl>    <dbl>     <dbl>## 1     2.06     1.89      2.24## 2     1.77     1.58      1.95## 3     1.20     1.06      1.39## 4     2.53     2.21      2.85## 5     1.86     1.68      2.06## 6     1.64     1.47      1.85

# prediction intervals on new observations (include observation error):p<-predict(m_spatial,type ="response",interval ="prediction")head(p)

## # A tibble: 6 × 3##   estimate conf_low conf_high##      <dbl>    <dbl>     <dbl>## 1     2.06    1.39       2.96## 2     1.77    0.983      2.62## 3     1.20    0.755      1.77## 4     2.53    1.51       3.73## 5     1.86    1.18       2.63## 6     1.64    0.975      2.46

Or use thetidy method to get our parameter estimates asa data frame:

head(tidy(m_spatial,conf.int =TRUE,conf.method ="HPDinterval"))

## # A tibble: 6 × 5##   term                     estimate std.error conf.low conf.high##   <chr>                       <dbl>     <dbl>    <dbl>     <dbl>## 1 gp_sigma                    0.261    0.0733    0.155     0.416## 2 gp_theta                    1.66     0.221     1.36      2.21 ## 3 B[1]                        0.479    0.0653    0.336     0.614## 4 B[2]                        0.193    0.0153    0.160     0.219## 5 CV[1]                       0.206    0.0116    0.183     0.227## 6 spatialEffectsKnots[1,1]    0.333    0.0809    0.143     0.478

Or make the predictions on a fine-scale spatial grid for a constantvalue of the predictor:

pred_grid<-expand.grid(lat =seq(min(d$lat),max(d$lat),length.out =30),lon =seq(min(d$lon),max(d$lon),length.out =30))pred_grid$temperature<-mean(d$temperature)pred_grid$prediction<-predict(  m_spatial,newdata = pred_grid,type ="response")$estimateggplot(pred_grid,aes(lon, lat,fill = prediction))+geom_raster()+  viridis::scale_fill_viridis()

[8]ページ先頭