Movatterモバイル変換


[0]ホーム

URL:


Overview of ubms

UnmarkedBayesianModels withStan

Ken Kellner

1 Introduction

1.1 What is ubms?

ubms is an R package for fitting models of wildlifeoccurrence and abundance to with datasets where animals are notindividually marked. It provides a nearly identical interface to thepopularunmarked package(Fiske, Chandler, and others 2011).Instead of using maximum likelihood to fit models (as withunmarked), models are fit in a Bayesian framework usingStan(Carpenter et al. 2017). It isgenerally expected that you are already familiar withunmarked when usingubms. You can downloadubms, report issues, or help with development onGithub.

1.2 Why should you useit?

There are several advantages to usingubms overunmarked. First, it is possible to include random effectsinubms models, which is not currently possible inunmarked. These are specified using the familiar syntax oflme4(Bates et al.2015). Second,ubms generates posteriordistributions for model parameters, including latent occupancy andabundance parameters. These can be useful for post-hoc analyses anddiagnostics. Finally, fitting models with Stan gives you access to thelarge ecosystem of Stan-related tools, such asLOO(leave-one-out-cross validation;(Vehtari, Gelman, and Gabry2017)).

Another alternative toubms would be to fit models in anexisting modeling language such as BUGS,JAGS, or directly in Stan.ubms abstracts away the complex process of defining andwriting custom models in these languages which need to be updated eachtime you make changes to your model. It also provides many useful helperfunctions (e.g. predict) which would otherwise requirecustom code. Finally, because of Stan’s efficient sampler(Hoffman and Gelman2014) and because the underlying likelihoods inubms are marginalized,ubms will often fitmodels much faster than equivalent models in BUGS or JAGS(Yackulic et al.2020).

1.3 What are thedisadvantages?

Relative tounmarked,ubms has fewer typesof models available. For example, models that incorporate temporaryemigration (likegdistsamp)(Chandler, Royle, and King 2011) arecurrently not available inubms. Models should run muchfaster inunmarked as well. If you do not need one of thespecific benefits ofubms described above, it makes senseto stick withunmarked. Even if you do plan to useubms, it makes sense to test the models inunmarked first. The similar interface between the twopackages makes this very easy, as you will see in the next section.

Relative to BUGS/JAGS/Stan,ubms is less flexiblebecause you cannot customize your model structure. You are limited tothe provided model types. Furthermore, you cannot currently customizeprior distributions (although I plan to add this in the future in someform). Finally, writing your own BUGS/JAGS model can be valuable forgaining a deeper understanding of how a model works;ubms,likeunmarked, is essentially a black box.

To summarize the advantages and disadvantages: I seeubms as an intermediate step along the continuum fromunmarked to custom models in BUGS/JAGS. It is not meant toreplace either approach but rather to supplement them, for situationswhen a Bayesian framework is needed and “off-the-shelf” model structuresare adequate.

2 Example: Fitting asingle-season occupancy model

Occupancy models estimate the probability\(\psi\) that a species occupies a site,while accounting for detection probability\(p< 1\)(MacKenzie et al. 2002). In orderto estimate both\(p\) and\(\psi\), repeated observations(detection/non-detection data) at each site are required.

2.1 Set up the data

First, load the dataset we’ll be using, which comes withunmarked:

library(unmarked)data(crossbill)

Thecrossbill dataset is adata.frame withmany columns. It contains detection/non-detection data for the Europeancrossbill (Loxia curvirostra) in Switzerland(Schmid, Zbinden, and Keller2004).

dim(crossbill)
## [1] 267  58
names(crossbill)
##  [1] "id"      "ele"     "forest"  "surveys" "det991"  "det992"  "det993" ##  [8] "det001"  "det002"  "det003"  "det011"  "det012"  "det013"  "det021" ## [15] "det022"  "det023"  "det031"  "det032"  "det033"  "det041"  "det042" ## [22] "det043"  "det051"  "det052"  "det053"  "det061"  "det062"  "det063" ## [29] "det071"  "det072"  "det073"  "date991" "date992" "date993" "date001"## [36] "date002" "date003" "date011" "date012" "date013" "date021" "date022"## [43] "date023" "date031" "date032" "date033" "date041" "date042" "date043"## [50] "date051" "date052" "date053" "date061" "date062" "date063" "date071"## [57] "date072" "date073"

Check?crossbill for details about each column. Thefirst three columnsid,ele, andforest are site covariates.

site_covs<- crossbill[,c("id","ele","forest")]

The following 27 columns beginning withdet are thebinary detection/non-detection data; 9 years with 3 observations peryear. For this example we want to fit a single-season occupancy model;thus we will use only the first three columns (year 1) ofdet as our response variabley.

y<- crossbill[,c("det991","det992","det993")]head(y)
##   det991 det992 det993## 1      0      0      0## 2      0      0      0## 3     NA     NA     NA## 4      0      0      0## 5      0      0      0## 6     NA     NA     NA

Note that missing values are possible. The final 27 columns beginningwithdate are the Julian dates for each observation. Aswithy we want only the first three columns correspondingto year 1.

date<- crossbill[,c("date991","date992","date993")]

Finally, we build ourunmarkedFrame object holding ourdetection/non-detection data, site covariates, and observationcovariates. Since we will conduct a single-season occupancy analysis, weneed to useunmarkedFrameOccu specifically. The resultingunmarkedFrame can be used by bothunmarked andubms.

umf<-unmarkedFrameOccu(y=y,siteCovs=site_covs,obsCovs=list(date=date))head(umf)
## Data frame representation of unmarkedFrame object.##   y.1 y.2 y.3 id  ele forest date.1 date.2 date.3## 1   0   0   0  1  450      3     34     59     65## 2   0   0   0  2  450     21     17     33     65## 3  NA  NA  NA  3 1050     32     NA     NA     NA## 4   0   0   0  4  950      9     29     59     65## 5   0   0   0  5 1150     35     24     45     65## 6  NA  NA  NA  6  550      2     NA     NA     NA

2.2 Fit a model

2.2.1 Fit the model inunmarked

First, we fit a null model (no covariates) inunmarkedusing theoccu function. Theoccu functionrequires as input a double formula (for detection and occupancy,respectively) along with ourunmarkedFrame.

(fit_unm<-occu(~1~1,data=umf))
## ## Call:## occu(formula = ~1 ~ 1, data = umf)## ## Occupancy (logit-scale):##  Estimate    SE     z P(>|z|)##    -0.546 0.218 -2.51  0.0121## ## Detection (logit-scale):##  Estimate    SE     z P(>|z|)##    -0.594 0.208 -2.86 0.00426## ## AIC: 511.2538 ## Number of sites: 245## ID of sites removed due to NA: 3 6 38 63 71 106 116 118 125 126 152 163 168 178 181 197 199 212 218 220 238 257

2.2.2 Fit the model inubms

Next, we fit the same model inubms. The equivalent tooccu inubms isstan_occu.Functions inubms generally use thisstan_prefix, based on the approach used in packagerstanarm forGLMs. We need to provide the same arguments tostan_occu.In addition, we will specify that we want 3 MCMC chains(chains=3), with 500 iterations per chain(iter=500) of which the first half will be warmupiterations. It is beyond the scope of this vignette to discuss theappropriate number or length of chains; seetheStan user’s guide for more details. Generally 4 chains of 2000iterations each is recommended (of which 1000 per chain are warmups).Thus, 500 iterations per chain is probably not enough, but to keepthings running quickly it is sufficient for this vignette. Note that ifyou are more familiar with BUGS or JAGS, Stan generally requires asmaller number of iterations to reach convergence thanks to its defaultNUTS sampler(Hoffmanand Gelman 2014). If you have a good multi-core CPU, you canrun chains in parallel. Tell Stan how many parallel cores you want touse by assigning a value to thecores argument.

library(ubms)(fit_stan<-stan_occu(~1~1,data=umf,chains=3,iter=500,cores=3,seed=123))
## ## Call:## stan_occu(formula = ~1 ~ 1, data = umf, chains = 3, iter = 500, ##     refresh = 0, seed = 123)## ## Occupancy (logit-scale):##  Estimate    SD   2.5%   97.5% n_eff Rhat##    -0.483 0.234 -0.882 0.00981   236    1## ## Detection (logit-scale):##  Estimate    SD  2.5%  97.5% n_eff Rhat##    -0.642 0.226 -1.11 -0.229   285    1## ## LOOIC: 511.954## Runtime: 24.968 sec

2.2.3 Compareresults

The structure of the output fromunmarked andubms is intentionally similar. Estimates of the occupancyand detection parameters are also similar, but not identical. For a moredirect comparison, call thecoef function on both modelobjects:

cbind(unmarked=coef(fit_unm),stan=coef(fit_stan))
##            unmarked       stan## psi(Int) -0.5461203 -0.4834253## p(Int)   -0.5939612 -0.6419488

2.2.4 Understanding theubms output summary

Let’s look at the output from ourfit_stan modelagain:

fit_stan
## ## Call:## stan_occu(formula = ~1 ~ 1, data = umf, chains = 3, iter = 500, ##     refresh = 0, seed = 123)## ## Occupancy (logit-scale):##  Estimate    SD   2.5%   97.5% n_eff Rhat##    -0.483 0.234 -0.882 0.00981   236    1## ## Detection (logit-scale):##  Estimate    SD  2.5%  97.5% n_eff Rhat##    -0.642 0.226 -1.11 -0.229   285    1## ## LOOIC: 511.954## Runtime: 24.968 sec

The first part (underCall:) is the command we used toget this model output. Underneath are two tables, one per submodel,corresponding to the occupancy and detection parts of the model. Withineach table there is one row per parameter in the submodel. Sincefit_stan had no covariates, there is only an intercept termfor each submodel. Model parameters in this summary table are alwaysshown on the appropriate transformed scale, in this case logit. To getthe corresponding probabilities, you can use thepredictfunction, which we will demonstrate later.

For each parameter, the mean and standard deviation of the posteriordistribution are given. Unlikeunmarked output, there is no\(Z\) or\(p\)-value. Instead, there is a 95%uncertainty interval provided.

The final two columns in each summary tablen_eff andRhat are MCMC diagnostics. We will discuss their meaninglater.

2.2.5 Extractingindividual parameters

To extract summary values into an R table for easy manipulation, usethesummary method. Note that you have to specify whichsubmodel you want ("state" for occupancy or"det" for detection).

sum_tab<-summary(fit_stan,"state")sum_tab$mean[1]
## [1] -0.4834253

To extract the entire posterior for a parameter, use theextract method. To avoid name collisions you need to usethe full name of the parameter (which contains both the submodel and theparameter name) when extracting. To see a list of the available fullparameter names, use thenames method.

names(fit_stan)
## [1] "beta_state[(Intercept)]" "beta_det[(Intercept)]"
occ_intercept<-extract(fit_stan,"beta_state[(Intercept)]")[[1]]hist(occ_intercept,freq=FALSE)lines(density(occ_intercept),col='red',lwd=2)

2.3 Compare candidatemodels

Now we’ll fit two candidate models to thecrossbill datainubms and compare them.

2.3.1 Fit the models

Along with our previous null model, we’ll fit a “global” model withboth site and observation covariates. This is just an example; perhapsother models should also be considered if we were preparing thisanalysis for publication. In our model formulas, we have normalized allcovariates withscale so they have a mean of 0 and astandard deviation of 1. This can help improve model convergence and isgenerally a good idea.

fit_null<- fit_stanfit_global<-stan_occu(~scale(date)~scale(forest)+scale(ele),data=umf,chains=3,iter=500,seed=123)
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.## Running the chains for more iterations may help. See## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.## Running the chains for more iterations may help. See## https://mc-stan.org/misc/warnings.html#tail-ess

Thefit_global model gave us some warnings about theeffective sample size (n_eff) along with a suggestedsolution. We will ignore this warning for now but normally it is a goodidea to pay close attention to these warnings.

2.3.2 Compare themodels

First we combine the models into afitList:

mods<-fitList(fit_null, fit_global)

Then we generate a model selection table:

round(modSel(mods),3)
##                elpd nparam elpd_diff se_diff## fit_global -236.776  5.984     0.000   0.000## fit_null   -255.977  2.524   -19.201   6.618

Instead of AIC, models are compared using leave-one-outcross-validation (LOO)(Vehtari, Gelman, and Gabry 2017) viatheloo package. Based on this cross-validation, theexpected predictive accuracy (elpd) for each model iscalculated. The model with the largestelpd(fit_global) performed best. Theelpd_diffcolumn shows the difference inelpd between a model and thetop model; if this difference is several times larger than the standarderror of the difference (se_diff), we are confident themodel with the largerelpd performed better. We can seethat thefit_global model is clearly the best performingmodel.

You can obtain LOO information for a single model using theloo method:

loo(fit_global)
## ## Computed from 750 by 245 log-likelihood matrix.## ##          Estimate   SE## elpd_loo   -236.8 19.2## p_loo         6.0  0.6## looic       473.6 38.5## ------## MCSE of elpd_loo is 0.1.## MCSE and ESS estimates assume MCMC draws (r_eff in [0.3, 1.5]).## ## All Pareto k estimates are good (k < 0.65).## See help('pareto-k-diagnostic') for details.

Thelooic value is analogous to AIC.

You can also obtain the WAIC (Widely Applicable InformationCriterion) if you prefer(Vehtari, Gelman, and Gabry2017):

waic(fit_global)
## ## Computed from 750 by 245 log-likelihood matrix.## ##           Estimate   SE## elpd_waic   -236.7 19.2## p_waic         5.9  0.6## waic         473.5 38.5

2.4 Diagnostics and modelfit

We’ll define the global model as our top model:

fit_top<- fit_globalfit_top
## ## Call:## stan_occu(formula = ~scale(date) ~ scale(forest) + scale(ele), ##     data = umf, chains = 3, iter = 500, refresh = 0, seed = 123)## ## Occupancy (logit-scale):##               Estimate    SD    2.5% 97.5% n_eff Rhat## (Intercept)     -0.603 0.359 -1.1896 0.293   208 1.01## scale(forest)    1.077 0.325  0.5595 1.925   211 1.01## scale(ele)       0.566 0.245  0.0924 1.054   533 1.00## ## Detection (logit-scale):##             Estimate    SD   2.5%  97.5% n_eff Rhat## (Intercept)   -0.754 0.253 -1.290 -0.316   300 1.01## scale(date)    0.558 0.170  0.233  0.883  1030 1.00## ## LOOIC: 473.552## Runtime: 37.205 sec

2.4.1 MCMCdiagnostics

Again looking at the summary offit_top, we concludeMCMC chains have converged if all\(\hat{R}< 1.05\). To visualize convergence, look at thetraceplots:

traceplot(fit_top,pars=c("beta_state","beta_det"))

The traceplots look a little messy. We may also get a warning thatn_eff is lacking for some parameters. The rule of thumb isto haven_eff > 100 * number of chains (300). The easysolution to both problems would be to re-run this model with moreiterations.

2.4.2 Model fit

Calculating residuals is tricky for occupancy models. There isn’t onewidely accepted way of doing it.ubms implements theapproach of Wright(Wright, Irvine, and Higgs 2019) inwhich residuals are calculated separately for the state and observationprocesses. To quickly plot residuals against fitted values, use theplot method:

plot_residuals(fit_top,submodel="state")

Note that the residuals are automatically binned, which isappropriate for a binomial response(Gelman and Hill 2007). If the modelfits the data well, you would expect 95% of the binned residual pointsto fall within the shaded area. You can also plot residuals againstcovariate values using theplot_residuals function:

plot_residuals(fit_top,submodel="state",covariate="forest")

ubms also supports goodness-of-fit tests (posteriorpredictive checks) for some models. In the case of occupancy models,there is support for the MacKenzie-Bailey (M-B) chi-square test(MacKenzie and Bailey2004) via thegof function. For each posteriordraw, the M-B statistic is calculated for the actual data and for asimulated dataset. The proportion of draws for which the simulatedstatistic is larger than the actual statistic should be near 0.5 if themodel fits well.

(fit_top_gof<-gof(fit_top,draws=100,quiet=TRUE))
## MacKenzie-Bailey Chi-square ## Point estimate = 28.241## Posterior predictive p = 0.04
plot(fit_top_gof)

Our model does not appear to fit well based on this posteriorpredictive check. The first step to addressing this would be to run themodel for more iterations to make sure that isn’t the reason.

You can use theposterior_predict function to simulatenew datasets, which you can use to calculate your own fit statistic. Thefollowing command generates 100 simulated datasets.

sim_y<-posterior_predict(fit_top,"y",draws=100)dim(sim_y)
## [1] 100 801

The output is a matrix with dimensions draws x observations (insite-major order). As an example, we can calculate the proportion ofzeros in each simulated dataset

prop0<-apply(sim_y,1,function(x)mean(x==0,na.rm=TRUE))

and compare that to the proportion of zeros in the actualdataset.

actual_prop0<-mean(getY(fit_top)==0,na.rm=TRUE)#Comparehist(prop0,col='gray')abline(v=actual_prop0,col='red',lwd=2)

2.5 Model inference

2.5.1 Marginal covariateeffects

Based on the 95% uncertainty intervals, both forest and elevationhave a positive effect on occupancy probability (both intervals do notcontain 0). Similarly, Julian date has a positive impact on detectionprobability. We can quickly visualize these marginal covariate effectswith theplot_effects function:

plot_effects(fit_top,"state")

plot_effects(fit_top,"det")

2.5.2 Predict parametervalues

As withunmarked, we can get the predicted\(psi\) or\(p\) for each site or observation using thepredict function. For example, to get occupancy:

head(predict(fit_top,submodel="state"))
##    Predicted         SD       2.5%     97.5%## 1 0.08475341 0.04074847 0.02603972 0.1870306## 2 0.15341676 0.06590045 0.05564030 0.3312959## 3 0.30784613 0.07968393 0.18695527 0.5197789## 4 0.14593709 0.04629648 0.07153534 0.2537933## 5 0.35156491 0.08389258 0.22562494 0.5726406## 6 0.08783817 0.04008446 0.02912174 0.1868335

You can also supply newdata as adata.frame.

nd<-data.frame(ele=100,forest=25)predict(fit_top,submodel="state",newdata=nd)
##   Predicted         SD       2.5%     97.5%## 1 0.1401127 0.07725292 0.03806715 0.3547347

2.5.3 Simulate latentoccupancy states

One of the advantages of BUGS/JAGS is that you can directly modellatent parameters, such as the true unobserved occupancy state of a site\(z\). Using theposterior_predict function inubms, you cangenerate an equivalent posterior distribution of\(z\).

zpost<-posterior_predict(fit_top,"z",draws=100)dim(zpost)
## [1] 100 267

The output has one row per posterior draw, and one column per site.The posterior of\(z\) can be usefulfor post-hoc analyses. For example, suppose you wanted to test for adifference in mean occupancy probability between sites 1-50 and sites51-100:

group1<-rowMeans(zpost[,1:50],na.rm=TRUE)group2<-rowMeans(zpost[,51:100],na.rm=TRUE)plot_dat<-rbind(data.frame(group="group1",occ=mean(group1),lower=quantile(group1,0.025),upper=quantile(group1,0.975)),data.frame(group="group2",occ=mean(group2),lower=quantile(group2,0.025),upper=quantile(group2,0.975)))

Now plot the posterior distributions of the two means:

library(ggplot2)ggplot(plot_dat,aes(x=group,y=occ))+geom_errorbar(aes(ymin=lower,ymax=upper),width=0.2)+geom_point(size=3)+ylim(0,1)+labs(x="Group",y="Occupancy + 95% UI")+theme_bw()+theme(panel.grid.major=element_blank(),panel.grid.minor=element_blank(),axis.text=element_text(size=12),axis.title=element_text(size=14))

3 References

Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015.“Fitting Linear Mixed-Effects Models Using Lme4.”Journal of Statistical Software 67 (1).https://doi.org/10.18637/jss.v067.i01.
Carpenter, Bob, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, BenGoodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li,and Allen Riddell. 2017.“Stan: A Probabilistic ProgrammingLanguage.”Journal of Statistical Software 76 (1).https://doi.org/10.18637/jss.v076.i01.
Chandler, Richard B., J. Andrew Royle, and David I. King. 2011.“Inference about Density and Temporary Emigration in UnmarkedPopulations.”Ecology 92 (7): 1429–35.https://doi.org/10.1890/10-2433.1.
Fiske, Ian, Richard Chandler, and others. 2011.“Unmarked: AnR Package for Fitting Hierarchical Models of WildlifeOccurrence and Abundance.”Journal of StatisticalSoftware 43 (10): 1–23.https://doi.org/10.18637/jss.v043.i10.
Gelman, Andrew, and Jennifer Hill. 2007.Data Analysis UsingRegression and Multilevel/Hierarchical Models. New York, NY:Cambridge University Press.
Hoffman, Matthew D, and Andrew Gelman. 2014.“TheNo-U-Turn Sampler: AdaptivelySetting Path Lengths inHamiltonianMonteCarlo.”Journal of Machine LearningResearch 15 (1): 1593–623.
MacKenzie, Darryl I., and Larissa L. Bailey. 2004.“Assessing theFit of Site-Occupancy Models.”Journal of Agricultural,Biological, and Environmental Statistics 9 (3): 300–318.https://doi.org/10.1198/108571104x3361.
MacKenzie, Darryl I., James D. Nichols, Gideon B. Lachman, Sam Droege,J. Andrew Royle, and Catherine A. Langtimm. 2002.“Estimating SiteOccupancy Rates When Detection Probabilities Are Less Than One.”Ecology 83 (8): 2248–55.https://doi.org/10.1890/0012-9658(2002)083[2248:esorwd]2.0.co;2.
Schmid, Hans, Niklaus Zbinden, and Verena Keller. 2004.“Überwachung Der Bestandsentwicklung Häufiger Brutvögel in DerSchweiz.”Swiss Ornithological Institute SempachSwitzerland.
Vehtari, Aki, Andrew Gelman, and Jonah Gabry. 2017.“PracticalBayesian Model Evaluation Using Leave-One-OutCross-Validation andWAIC.”Statistics andComputing 27 (5): 1413–32.https://doi.org/10.1007/s11222-016-9696-4.
Wright, Wilson J., Kathryn M. Irvine, and Megan D. Higgs. 2019.“Identifying Occupancy Model Inadequacies: Can ResidualsSeparately Assess Detection and Presence?”Ecology,e02703.https://doi.org/10.1002/ecy.2703.
Yackulic, Charles B., Michael Dodrill, Maria Dzul, Jamie S. Sanderlin,and Janice A. Reid. 2020.“A Need for Speed inBayesian Population Models: A Practical Guide toMarginalizing and Recovering Discrete Latent States.”Ecological Applications 30 (5).https://doi.org/10.1002/eap.2112.

[8]ページ先頭

©2009-2025 Movatter.jp