Movatterモバイル変換


[0]ホーム

URL:


Eco-Stats – Code and Data Accompanying theEco-Stats Text

David Warton

2025-06-06

The Eco-Stats package contains functions and data supporting theEco-Stats text (Warton, 2022, Springer), and solutions to exercises.Functions include tools for using simulation envelopes in diagnosticplots, also applicable to multivariate linear models, and a parametricbootstrap function that can be used in place of ananovacall for hypothesis testing via simulation, for many types of regressionmodels. Datasets mentioned in the package are included here (where notavailable elsewhere) and vignettes work through code chunks andexercises from the textbook, one chapter at a time.

Simulation envelopes in plots

The commandplotenvelope will take a fitted object andconstruct standard residual plots with global envelopes added aroundpoints (for quantile plots) and around smoothers (for residual plots),constructed via simulation. These are constructed using theGET package as global envelopes, which is important forinterpretation – it means that when model assumptions are correct, 95%of quantile plot envelopes (at confidence level 95%) should contain thedata (or smoother) over thewhole plot. (Pointwise envelopeswould have been easier to construct, but are much harder to interpretbecause they don’t control the chance of missing the data globally,across the whole plot. A 95% pointwise envelope, constructed whenassumptions are satisfied, might for example misssome of thedata 60% of the time.)

plotenvelope will work on lots of different types offitted objects – pretty much anything that comes with asimulate function that behaves in a standard way. Asimulate.mlm andsimulate.manyglm functionshave been written in this package specifically so thatplotenvelope also works for multivariate models fittedusinglm ormvabund::manyglm.

library(ecostats)data(iris)Y=with(iris,cbind(Sepal.Length,Sepal.Width,Petal.Length,Petal.Width))iris.mlm=lm(Y~Species,data=iris)# check normality assumption:par(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(iris.mlm,n.sim=199)

Formlm objects, this function will compute conditionalresiduals and fitted values, that is, they are computed for eachresponse conditional on all other responses being observed, via thecpredict andcresiduals functions. This isdone because the full conditionals of a distribution are known to bediagnostic of joint distributions, hence any violation of themultivariate normality assumption will be expressed as a violation ofassumptions of these full conditional models. The full conditionals arewell-known to follow a linear model, as a function of all otherresponses as well as predictors.

Theqqenvelope function can be applied for a normalquantile plot, with global envelope, to either a fitted model or asample of data:

y=rnorm(20)qqenvelope(y)

anova tests using a parametric bootstrap

The commandanovaPB computes analysis of variance (ordeviance) tables for two fitted model objects, but with the\({P}\)-value estimated using a parametricbootstrap – by repeatedly simulating new responses from the fitted modelunder the null hypothesis. This will work on lots of different types offitted objects – likeplotenvelope, it should work onpretty much anything that comes with asimulate functionthat behaves in a standard way. These fitted models also need to respondto eitheranova orlogLik.

While the interface is written to be a lot likeanova,it requires two fitted objects to be specified – the first being a fitunder the null hypothesis, and the second being the fit under thealternative.

# generate random Poisson data and a predictor:y=rpois(50,lambda=1)x=1:50# fit a Poisson regressions with and without x:rpois_glm=glm(y~x,family=poisson())rpois_int=glm(y~1,family=poisson())# use the parametric bootstrap to test for an effect of x (will usually be non-significant)anovaPB(rpois_int,rpois_glm,n.sim=99,ncpus=1)#>#> Analysis of Deviance Table#>#> Model 1: y ~ 1#> Model 2: y ~ x#>   Resid. Df Resid. Dev Df Deviance Pr(>Chi)#> 1        49      69.41#> 2        48      69.18  1    0.223     0.63#>#> P-value calculated by simulating 99 samples from 1.

Datasets

All datasets used in the Eco-Stats text, where not availableelsewhere, are supplied in this package. For example:

data(aphids) cols=c(rgb(1,0,0,alpha=0.5),rgb(0,0,1,alpha=0.5))#transparent colourswith(aphids$oat,interaction.plot(Time,Plot,logcount,legend=FALSE,col=cols[Treatment],lty=1,ylab="Counts [log(y+1) scale]",xlab="Time (days since treatment)") )legend("bottomleft",c("Excluded","Present"),col=cols,lty=1)


[8]ページ先頭

©2009-2025 Movatter.jp