In Anthony’s revegetation study (Exercise 10.3), he classifiedanything that fell into his pitfall traps to Order, and thus counted theabundance of each of 24 invertebrate Orders across ten sites. He wantsto know: Is there evidence of a change in invertebrate communities dueto revegetation efforts?
What type of response variable(s) does he have? How shouldAnthony analyse his data?
He has a multivariate abundance dataset, with 24 correlated counts ofinvertebrates in different Orders. He needs to use some GLM approachthat can handle correlation and many responses, likemanyglm from themvabund package.
In Exercise 1.13, David and Alistair looked at invertebrateepifauna settling on algal beds (seaweed) with different levels ofisolation (0, 2, or 10 metre buffer) from each other, at two samplingtimes (5 and 10 weeks). They observed presence/absence patterns of 16different types of invertebrate (across 10 replicates). They would liketo know if there is any evidence of a difference in invertebratepresence/absence patterns with Distance of Isolation.
How should they analyse the data?
They have a multivariate abundance dataset, with 16 correlatedpresence/absence measurements of invertebrates. They need to use someGLM approach that can handle correlation and many responses, likemanyglm from themvabund package.
As in Exercise 10.2, Lena studied the effects of an offshore windfarm on fish communities, by collecting paired data before and afterwind farm construction, at 36 stations in each of three zones (windfarm, North, or South). She counted how many fish were caught at eachstation, classified into 16 different taxa. Lena wants to know if thereis any evidence of a change in fish communities at wind farm stations,as compared to others, following construction of the wind farm.
How should she analyse the data?
She has a multivariate abundance dataset, with 16 correlated countsof different fish species. She needs to use some GLM approach that canhandle correlation and many responses, likemanyglm fromthemvabund package.
library(ecostats)library(mvabund)data(reveg)reveg$abundMV=mvabund(reveg$abund)ft_reveg=manyglm(abundMV~treatment+offset(log(pitfalls)),family="negative.binomial",data=reveg)# offset included as in Exercise 10.9anova(ft_reveg)#> Time elapsed: 0 hr 0 min 7 sec#> Analysis of Deviance Table#>#> Model: abundMV ~ treatment + offset(log(pitfalls))#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> (Intercept) 9#> treatment 8 1 78.25 0.02 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 999 iterations via PIT-trap resampling.Consider theseaweed dataset from David’s andAlistair’s study of invertebrate epifauna settling on algal beds withdifferent levels of isolation (0, 2 or 10 metre buffer) at differentsampling times (5 and 10 weeks), with varying seaweed biomass in eachpatch.
What sort of model is appropriate for this dataset? Fit thismodel and call itft_epiAlt and runanova(ft_epiAlt). (This might take a couple of minutes torun.)
Previous experience suggests anything beyond second-orderinteractions does not seem to matter – you could fit more but it wouldtake longer. Because of the warning that this will take a long time torun I will setnBoot=99 for a quicker, rough answer:
data(seaweed)seaweed$Dist=as.factor(seaweed$Dist)# set up presence-absence response:seaweed$PA=mvabund(seaweed[,6:21])seaweed$PA[seaweed$PA>1]=1#fit modelft_epiAlt=manyglm(PA~(Wmass+Size+Time)*Dist,family="cloglog",data=seaweed)anova(ft_epiAlt,nBoot=99)#> Time elapsed: 0 hr 0 min 13 sec#> Analysis of Deviance Table#>#> Model: PA ~ (Wmass + Size + Time) * Dist#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> (Intercept) 56#> Wmass 55 1 6.66 0.93#> Size 54 1 36.98 0.02 *#> Time 53 1 32.35 0.04 *#> Dist 51 2 28.26 0.79#> Wmass:Dist 49 2 33.29 0.25#> Size:Dist 47 2 26.77 0.10 .#> Time:Dist 45 2 32.56 0.07 .#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 99 iterations via PIT-trap resampling.Now fit a model under the null hypothesis that there is no effectof Distance of isolation, and call itft_epiNull. Runanova(ft_epiNull, ft_epiAlt). This second anova took muchless time to fit – why? Is there evidence of an effect of distance ofisolation on presence/absence patterns in the invertebratecommunity?
To see if there is an effect ofDist I guess we couldremove it from the model entirely and see if that does anything:
ft_epiNull=manyglm(PA~Wmass+Size+Time,family="cloglog",data=seaweed)anova(ft_epiNull, ft_epiAlt,nBoot=99)#> Time elapsed: 0 hr 0 min 2 sec#> Analysis of Deviance Table#>#> ft_epiNull: PA ~ Wmass + Size + Time#> ft_epiAlt: PA ~ (Wmass + Size + Time) * Dist#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_epiNull 53#> ft_epiAlt 45 8 115.2 0.41#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 99 iterations via PIT-trap resampling.This secondanova took much less time to fit –why?
Because it only has to test one hypothesis, not seven, so took abouta seventh of the time!
Is there evidence of an effect of distance of isolation onpresence/absence patterns in the invertebrate community?
No :(
par(mfrow=c(1,3),mar=c(3,3,2,1),mgp=c(1.75,0.75,0))ft_reveg=manyglm(abundMV~treatment,offset=log(pitfalls),family="negative.binomial",data=reveg)plotenvelope(ft_reveg,which=1:3)What do you reckon?
Assumptions seem reasonable here.
ft_revegP=manyglm(abundMV~treatment,offset=log(pitfalls),family="poisson",data=reveg)par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(ft_revegP,which=1:3,sim.method="stand.norm")meanvar.plot(reveg$abundMV~reveg$treatment)#> START SECTION 2#> Plotting if overlay is TRUE#> using grouping variable reveg$treatment 7 mean values were 0 and could#> not be included in the log-plot#> using grouping variable reveg$treatment 10 variance values were 0 and could not#> be included in the log-plot#> FINISHED SECTION 2abline(a=0,b=1,col="darkgreen")How’s the Poisson assumption looking?
It is in all sorts of trouble, there is a very strong fan-shape onthe residuals vs fits plot, points drift well outside the simulationenvelope on the nromal quantile plot (with many residuals larger than 5or less than -5) and a strong increasing trend on the residual vs fitsplot. These are all symptomatic of overdispersion, confirmed by thesample mean-variance plot, where points tend to fall above theone-to-one line at large means.
What assumptions were made?
We assumed:
Where possible, check these assumptions.
par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))ft_epiAlt=manyglm(PA~(Wmass+Size+Time)*Dist,family=binomial("cloglog"),data=seaweed)plotenvelope(ft_epiAlt,which=1:3)Do assumptions seem reasonable?
Yes it all looks good to me. This is not unexpected becausepresence-absence data are by definition Bernoulli, the only potentialissues could be if there were higher order interactions or violations ofte independence assumption.
Consider Lena’s offshore wind farm study (Exercise 14.3). Fit anappropriate model to the data. Make sure you include a Station maineffect (to account for the paired sampling design).
data(windFarms)ft_wind=manyglm(mvabund(windFarms$abund)~Station+Year+Year:Zone,family="poisson",data=windFarms$X)What assumptions were made?
We assumed:
Where possible, check these assumptions.
Used “stand.norm” because this model takes a while to fit.
par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(ft_wind,which=1:3,sim.method="stand.norm")Do assumptions seem reasonable? In particular, think aboutwhether there is evidence that the counts are overdispersed compared tothe Poisson.
Things generally look OK, but there is a downward trend on thescale-location plot, weakly suggestive of under-dispersion. Note the simenvelope was constructed using"stand.norm", meaning wecompared the smoother to what would be expected for a bunch of standardnormal residuals. But the kink could be a weird artifact of the modelbeing fitted (which has loads of parameters) so let’s repeat using"refit", but with a smaller value ofn.sim soit doesn’t take too long (it will take a few minutes though):
par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(ft_wind,which=1:3,n.sim=59)#> Error in glm.fit(x = structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, :#> NA/NaN/Inf in 'x'(You can ignore any error messages - these seem to happen becausepredicted values were numerically zero.)
Anyway, the downward trend in the scale-location plot happened insimulated data too, so it is nothing really to worry about. It seems tobe an artifact, probably because of overfitting of the data by includingaStation term in the model – this means that there is aparameter for every two observations in the data! (If a random effectwere used instead, this effect would have been weaker. But it isn’treally a problem.)
anova(ft_reveg,test="wald",cor.type="shrink")#> Time elapsed: 0 hr 0 min 6 sec#> Analysis of Variance Table#>#> Model: abundMV ~ treatment#>#> Multivariate test:#> Res.Df Df.diff wald Pr(>wald)#> (Intercept) 9#> treatment 8 1 8.698 0.05 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming correlated response via ridge regularization#> P-value calculated using 999 iterations via PIT-trap resampling.Fit models [to Lena’s windfarm data] under the null andalternative hypotheses of interest. Run an anova to compare these twomodels, with just 19 bootstrap resamples, to estimate computationtime.
The null model in this case has noYear:Zone term… therecould still be year-to-year variation, but it should not affectdifferent Zones in different ways.
windMV=mvabund(windFarms$abund)ft_wind=manyglm(windMV~Station+Year+Year:Zone,family="poisson",data=windFarms$X)ft_windNull=manyglm(windMV~Station+Year,family="poisson",data=windFarms$X)anova(ft_windNull, ft_wind,nBoot=19)#> Time elapsed: 0 hr 0 min 11 sec#> Analysis of Deviance Table#>#> ft_windNull: windMV ~ Station + Year#> ft_wind: windMV ~ Station + Year + Year:Zone#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_windNull 70#> ft_wind 68 2 81.83 0.05 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 19 iterations via PIT-trap resampling.This took 12 seconds for me.
Remove zerotons and singletons from the dataset using:windMV = mvabund( windFarms$abund[,colSums(windFarms$abund>0)>1] )Now fit a model to this new response variable, again with just 19bootstrap resamples.
windMV1=mvabund(windFarms$abund[,colSums(windFarms$abund>0)>1])ft_wind1=manyglm(windMV1~Station+Year+Year:Zone,family="poisson",data=windFarms$X)ft_windNull1=manyglm(windMV1~Station+Year,family="poisson",data=windFarms$X)anova(ft_windNull1, ft_wind1,nBoot=19)#> Time elapsed: 0 hr 0 min 11 sec#> Analysis of Deviance Table#>#> ft_windNull1: windMV1 ~ Station + Year#> ft_wind1: windMV1 ~ Station + Year + Year:Zone#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_windNull1 70#> ft_wind1 68 2 81.83 0.05 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 19 iterations via PIT-trap resampling.Did this run take less time? Yes, but not much less, it took10 seconds for me.
How do results compare? Results were almost identical, withthe same test statistic (to two decimal places).
How long do you think it would take to fit a model with 999bootstrap resamples, for an accurate 𝑃-value?
As a rough estimate, multiply the time you just got by 50. For methis is about 500 seconds, so I would expect it to take about eightminutes.
It is curious that this didn’t change computation time all that much.Looking at number of non-zero values inwindMV:
colSums(windFarms$abund>0)#> Bergvar Oxsimpa Piggvar Roodspotta Rootsimpa#> 0 51 1 3 30#> Sandskaadda Sjurygg Skrubbskaadda Skaaggsimpa Skaarsnultra#> 5 6 22 6 4#> Stensnultra Svart.smoorbult Torsk Tanglake Aakta.tunga#> 37 3 134 136 0#> AL#> 61we see that only three of the 16 species had one or less presence. Acouple more had three counts, removing these as well:
windMV3=mvabund(windFarms$abund[,colSums(windFarms$abund>0)>3])ft_wind3=manyglm(windMV3~Station+Year+Year:Zone,family="poisson",data=windFarms$X)ft_windNull3=manyglm(windMV3~Station+Year,family="poisson",data=windFarms$X)anova(ft_windNull3, ft_wind3,nBoot=19)#> Time elapsed: 0 hr 0 min 10 sec#> Analysis of Deviance Table#>#> ft_windNull3: windMV3 ~ Station + Year#> ft_wind3: windMV3 ~ Station + Year + Year:Zone#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_windNull3 70#> ft_wind3 68 2 74.19 0.05 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 19 iterations via PIT-trap resampling.This is only slightly quicker, which is a tad surprising. The teststatistic is only slightly smaller, so we can see clearly that theaction is happening for the more abundant species – this is notsurprising, there is not enough information in any of the rare speciesto detect an interaction!
habOrd= counts=as.matrix(round(seaweed[,6:21]*seaweed$Wmass)) habOrd[counts>0& counts<10]=1habOrd[counts>=10]=2library(ordinal)#>#> Attaching package: 'ordinal'#> The following objects are masked from 'package:VGAM':#>#> dgumbel, dlgamma, pgumbel, plgamma, qgumbel, rgumbel, wine#> The following objects are masked from 'package:glmmTMB':#>#> ranef, VarCorr#> The following objects are masked from 'package:nlme':#>#> ranef, VarCorr#> The following objects are masked from 'package:lme4':#>#> ranef, VarCorr#> The following object is masked from 'package:dplyr':#>#> slicesummary(habOrd)# Amphipods are all "2" which would return an error in clm#> Amph Cope Poly Anem Iso Bival#> Min. :2 Min. :0.000 Min. :0.000 Min. :0.0000 Min. :0.000 Min. :0.000#> 1st Qu.:2 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:0.0000 1st Qu.:2.000 1st Qu.:0.000#> Median :2 Median :2.000 Median :2.000 Median :0.0000 Median :2.000 Median :2.000#> Mean :2 Mean :1.895 Mean :1.772 Mean :0.2456 Mean :1.877 Mean :1.351#> 3rd Qu.:2 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:0.0000 3rd Qu.:2.000 3rd Qu.:2.000#> Max. :2 Max. :2.000 Max. :2.000 Max. :2.0000 Max. :2.000 Max. :2.000#> Gast Turb Prawn Urchin Fish#> Min. :1.000 Min. :0.00000 Min. :0.0000 Min. :0.00000 Min. :0.0000#> 1st Qu.:2.000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000#> Median :2.000 Median :0.00000 Median :0.0000 Median :0.00000 Median :0.0000#> Mean :1.947 Mean :0.08772 Mean :0.5263 Mean :0.07018 Mean :0.1754#> 3rd Qu.:2.000 3rd Qu.:0.00000 3rd Qu.:2.0000 3rd Qu.:0.00000 3rd Qu.:0.0000#> Max. :2.000 Max. :2.00000 Max. :2.0000 Max. :2.00000 Max. :2.0000#> Crab Caddis Opi Ost Bstar#> Min. :0.0000 Min. :0.00000 Min. :0.0000 Min. :0.000 Min. :0.0000#> 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000#> Median :0.0000 Median :0.00000 Median :0.0000 Median :2.000 Median :0.0000#> Mean :0.5965 Mean :0.03509 Mean :0.1404 Mean :1.404 Mean :0.2105#> 3rd Qu.:2.0000 3rd Qu.:0.00000 3rd Qu.:0.0000 3rd Qu.:2.000 3rd Qu.:0.0000#> Max. :2.0000 Max. :2.00000 Max. :2.0000 Max. :2.000 Max. :2.0000habOrd=habOrd[,-1]#remove AmphipodsmanyOrd=manyany(habOrd~Dist*Time*Size,"clm",data=seaweed)manyOrdNull=manyany(habOrd~Time*Size,"clm",data=seaweed)anova(manyOrdNull, manyOrd)#>#> LR Pr(>LR)#> sum-of-LR 101.1 0.17What hypothesis has been tested here? Is there any evidenceagainst it?
We tested for an effect of distance on abundance, when classifiedinto a three-level ordinal factor. There is no evidence of aDist effect.
Theordinal package has a bug in it (in version 2019.12)so it conflicts withlme4 (specifically it overwrites theranef function), [issue posted on Github] (https://github.com/runehaubo/ordinal/issues/48). So ifyou are running analyses using both packages, you need todetach theordinal package before continuing…
data
ft_comp=manyglm(abundMV~treatment+offset(log(pitfalls)),data=reveg,composition=TRUE)anova(ft_comp,nBoot=99)#> Time elapsed: 0 hr 0 min 21 sec#> Analysis of Deviance Table#>#> Model: abundMV ~ cols + treatment + offset(log(pitfalls)) + rows + cols:(treatment + offset(log(pitfalls)))#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> (Intercept) 239#> cols 216 23 361.2 0.01 **#> treatment 215 1 14.1 0.01 **#> rows 206 9 25.5 0.02 *#> cols:treatment 184 23 56.7 0.01 **#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments: P-value calculated using 99 iterations via PIT-trap resampling.Which term measures the effect of treatment on relativeabundance? Is there evidence of an effect on relativeabundance?
cols:treatment. There is evidence of an effect ofrelative abundance.
ft_null=manyglm(abundMV~cols+rows+offset(log(pitfalls)),data=ft_comp$data)ft_alt=manyglm(abundMV~cols+rows+treatment:cols+offset(log(pitfalls)),data=ft_comp$data)anova(ft_null,ft_alt,nBoot=99,block=ft_comp$rows)#> Time elapsed: 0 hr 0 min 5 sec#> Analysis of Deviance Table#>#> ft_null: abundMV ~ cols + rows + offset(log(pitfalls))#> ft_alt: abundMV ~ cols + rows + treatment:cols + offset(log(pitfalls))#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_null 207#> ft_alt 184 23 56.74 0.01 **#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments: P-value calculated using 99 iterations via PIT-trap resampling.ft_reveg0=manyglm(abundMV~1+offset(log(pitfalls)),data=reveg)QDrows0=log(rowSums(reveg$abundMV))-log(rowSums(fitted(ft_reveg0)) )ft_row0=manyglm(abundMV~1+offset(log(pitfalls))+offset(QDrows0),data=reveg)ft_reveg=manyglm(abundMV~treatment+offset(log(pitfalls)),data=reveg)QDrows=log(rowSums(reveg$abundMV))-log(rowSums(fitted(ft_reveg)) )ft_row=manyglm(abundMV~treatment+offset(log(pitfalls))+offset(QDrows),data=reveg)anova(ft_row0,ft_row)#> Time elapsed: 0 hr 0 min 7 sec#> Analysis of Deviance Table#>#> ft_row0: abundMV ~ 1 + offset(log(pitfalls)) + offset(QDrows0)#> ft_row: abundMV ~ treatment + offset(log(pitfalls)) + offset(QDrows)#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> ft_row0 9#> ft_row 8 1 50.26 0.073 .#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 999 iterations via PIT-trap resampling.This was over ten times quicker than Code Box 14.7 (note that itused ten times as many resamples), but results are slightly different –the test statistic is slightly smaller, and the\(P\)-value larger. Why do you think thismight be the case?
The quick-and-dirty offset approach only roughly approximates the roweffect, so it would be expected to lose some efficiency (leading to asmaller test statistic and a larger\(P\)-value).
an_reveg=anova(ft_reveg,p.uni="adjusted")#> Time elapsed: 0 hr 0 min 6 secan_reveg#> Analysis of Deviance Table#>#> Model: abundMV ~ treatment + offset(log(pitfalls))#>#> Multivariate test:#> Res.Df Df.diff Dev Pr(>Dev)#> (Intercept) 9#> treatment 8 1 78.25 0.022 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#>#> Univariate Tests:#> Acarina Amphipoda Araneae Blattodea#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 8.538 0.183 9.363 0.155 0.493 0.976 10.679 0.118#> Coleoptera Collembola Dermaptera Diotocardia#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 9.741 0.143 6.786 0.297 0.196 0.976 0 0.983#> Diplura Diptera Formicidae Haplotaxida#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 2.24 0.840 5.93 0.348 0.831 0.973 2.889 0.766#> Hemiptera Hymenoptera Isopoda Larvae#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 1.302 0.967 4.254 0.528 1.096 0.973 0.463 0.976#> Lepidoptera Polydesmida Pseudoscorpionida#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 0.913 0.973 1.451 0.957 1.056 0.973#> Scolopendrida Seolifera Soleolifera Thysanoptera#> Dev Pr(>Dev) Dev Pr(>Dev) Dev Pr(>Dev) Dev#> (Intercept)#> treatment 0.913 0.973 1.03 0.973 4.223 0.528 1.056#> Tricladida#> Pr(>Dev) Dev Pr(>Dev)#> (Intercept)#> treatment 0.973 2.804 0.766#> Arguments:#> Test statistics calculated assuming uncorrelated response (for faster computation)#> P-value calculated using 999 iterations via PIT-trap resampling.sortedRevegStats=sort(an_reveg$uni.test[2,],decreasing=T,index.return=T)sortedRevegStats$x[1:5]#> Blattodea Coleoptera Amphipoda Acarina Collembola#> 10.679374 9.741038 9.362519 8.537903 6.785946sum(sortedRevegStats$x[1:5])/an_reveg$table[2,3]#> [1] 0.5764636coef(ft_reveg)[,sortedRevegStats$ix[1:5]]#> Blattodea Coleoptera Amphipoda Acarina Collembola#> (Intercept) -0.3566749 -1.609438 -16.42495 1.064711 5.056246#> treatmentImpact -3.3068867 5.009950 19.42990 2.518570 2.045361ft_reveg$stderr[,sortedRevegStats$ix[1:5]]#> Blattodea Coleoptera Amphipoda Acarina Collembola#> (Intercept) 0.3779645 1.004969 707.1068 0.5171539 0.4879159#> treatmentImpact 1.0690450 1.066918 707.1069 0.5713194 0.5453801Which fish species are most strongly associated with offshorewind farms, in Lena’s study? Reanalyse the data to obtain univariatetest statistics and univariate\(P\)-values that have been adjusted formultiple testing.
data(windFarms)windMV1=mvabund(windFarms$abund[,colSums(windFarms$abund>0)>1])ft_wind1=manyglm(windMV1~Station+Year+Year:Zone,family="poisson",data=windFarms$X)an_wind1=anova(ft_wind1,p.uni="adjusted",nBoot=99)#> Time elapsed: 0 hr 2 min 48 secsortedWindStats=sort(an_wind1$uni.test[4,],decreasing=T,index.return=T)sortedWindStats$x[1:5]#> Tanglake Torsk Oxsimpa Rootsimpa AL#> 29.019951 22.296044 6.899955 6.801339 4.819447an_wind1$uni.p[4,sortedWindStats$ix[1:5]]#> Tanglake Torsk Oxsimpa Rootsimpa AL#> 0.01 0.01 0.15 0.15 0.23Is there evidence that any species clearly have a Zone:Yearinteraction, after adjusting for multiple testing?
Yes it looks like we have evidence of an effect inTanglake (Viviparous Eelpout) andTorsk(cod).
What proportion of the totalZone:Year effect isattributable to these potential indicator species?
(These code chunk has not been run as it is reliant on theprevious code chunk.)
So 63% of theZone:Year effect is just these twospecies.
Plot the abundance of each potential indicator species againstZone and Year.
What is the nature of the wind farm effect for each species? Doyou think these species are good indicators of an effect of windfarms?
It is hard to say. I guess there has been a big increase in Ellpoutin 2010, perhaps more so in wind farms than in North Zone. Cod numbersseemed to increase in 2010 for South Zone and not wind farms.Interaction plots might be a better way to see this…